Device, system and method for determining document similarities and differences

ABSTRACT

A device, system and method of outputting information is disclosed to compare at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of these documents. These documents comprise document subsections and the subsections comprise document subsection headers associated therewith. At least one of the first document subsection headers is juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping. This header mapping is established by: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and, in relation to the subsection mapping and the association between the document subsections and the subsection headers, further mapping the first document subsection headers relative to the second document subsection headers. Several closely-related devices, systems and methods for outputting information to compare at least two documents, establishing a mapping to compare at least two documents, and highlighting similar text segments, are also disclosed.

BACKGROUND OF INVENTION

This disclosure relates to the field of computer document processing, and particularly to enhancing the ability of a computer user to compare two or more documents and quickly understand the similarities and differences between or among these documents.

There are many circumstances in today's electronic world where it is desirable to compare two or more documents with one another. Writers of all stripes frequently wish to compare one version of a document with another version of a document, to see what similarities and changes exist as between two different versions of the same document. Legal professionals often need to compare different documents to see how they differ, whether these be, for example, draft contract proposals from two or more parties, or two different patents or patent applications. Lawmakers similarly need to compare various competing proposals for legislation, and to pinpoint what is different and what is the same between or among two or more often extremely lengthy and unwieldy proposals. A copyright attorney may be in the position of comparing two written works, to see if one has “copied” the other sufficiently to constitute an infringement. And, in a myriad of other situations, the need to compare documents and quickly pinpoint their similarities and differences has grown to near ubiquity in today's electronic world. It is important to understand that this goes beyond and is independent of merely managing different versions of the same document. This encompasses the situation in which it is simply necessary to compare two documents, whatever their origins, for differences and similarities. Editing the document may also be done after or in conjunction with this comparison, by one person or by a group of people, but in many situations, the comparison may be the end result in and of itself.

Perhaps the most familiar method used to compare documents is the so-called “underline strikeout” method, such as is employed in the widely-used Microsoft® Word program. In this method, a first document is compared to a second document, and those words or phrases that are deleted in going from the first document to the second document are highlighted with a “strikeout” indicator on a computerized output device, while those that are added from the first to the second document are highlighted with an “underline” indicator on the output device.

This approach has many drawbacks, some of which will be described here. First, the two documents are not separately presented, but are merged into a single document rather than in distinct, juxtaposed windows. Thus, the user viewing the output presentation can often become confused about what was in the first document versus the second document. Also, the comparison method itself is generally serial, front-to-back, which makes it difficult for the user to identify when text has simply been moved from one place to another. Frequently, if a segment of text is moved, it will appear as a strikeout (deletion) from the first document, and an underline (addition) to the second document at an entirely different location. This gives the user no clue that this segment was actually moved, or where it was moved from and to. Further, for a structured document with multiple headings as well as a hierarchical header structure, the structural meaning of the headings is ignored, and these document headers are treated just like any other items of text. The entire document is outputted en-masse, and the user has no way to start from a header-based “table of contents” and simply “drill down” into the document sections that are of most interest for comparison. There are no suitable statistical or similar comparison summaries of similarities and differences, either for the whole document, or for various document substructures, to aid the user in navigating over to the subsections of greatest interest. Additionally, there is no context information outputted in conjunction with the document text to enable the user to immediately determine where the outputted text fits in the context of the overall document structure. Finally, individual subsections are not in any way “mapped” to one another before comparison, so that in deciding what to underline and what to strikeout, the computer processor's determination that something in the first document is “different” than something in the second document may be erroneous, because it is not comparing the right document subsections with one another following an appropriate subsection mapping.

A moderate enhancement to traditional underline strikeout methods is achieved by Workshare® in its DeltaView® document comparison software. Most significantly, if a segment of text is moved, it is identified as such, rather than simply as a deletion from the first document and an addition to the second document in some unrelated and not-indicated location. In particular, the move is still identified as a deletion from the first document and an addition to the second document, but is given a highlighting different from highlighting given to an ordinary deletion and addition, so as to specifically identify it as a move. This is still problematic, however, since it does not link the old location of the moved material in the first document to its new location in the second document. If there are multiple items of text that are moved or if the documents are long documents and the text is moved far from its original location, the benefit of this feature is lost since the moves will simply get lost amidst one another or the user will have to scroll through a large segment of text to find where the original text was moved to. At most, this is a move-enhanced form of underline strikeout, which otherwise diverges very little from conventional underline strikeout.

In addition to a window outputting the above-discussed enhanced underline/strikeout information, DeltaView® also has two further windows, one showing the first document, and the other showing the second document. However, these two windows show these two documents in clean, unmarked form, and do not in any way highlight changes as between the first and second documents. All information about the changes must be gleaned from the third, enhanced underline/strikeout window. It would be preferable, and would present a much simpler and easier to use output, if the enhanced underline/strikeout window were to be omitted, and if all of the highlighting information summarizing similarities and differences between the two documents were to be presented in only two windows, one for the first document, and one for the second, rather than in the three windows required for the DeltaView® presentation.

Finally, as with Microsoft® Word and similar software, DeltaView® entirely ignores the document structure, and document headers are treated just like any other items of text. As a result, the entire document is again outputted en-masse, and the user has no way to start from a header-based “table of contents” and simply “drill down” into the sections of the documents that are of most interest for comparison, aided by suitable statistical or similar comparison summaries of similarities and differences, either for the whole document, or in association with various document substructures. Additionally, there is no context information outputted in conjunction with the document text that would enable the user to immediately determine where the outputted text fits in the context of the overall document structure. Finally, individual subsections are not in any way “mapped” to one another before comparison, so that in deciding what to underline and what to strikeout, the computer processor's determination that something in the first document is “different” than something in the second document may be erroneous, because it is not comparing the right document subsections with one another following an appropriate subsection mapping.

Microsoft® WinDiff outputs a document comparison that is almost unintelligible to a novice computer user. As with the underline/strikeout method, the two documents are not separately presented, but are merged into a single document rather than in distinct, juxtaposed windows. But the comparison is even more unwieldy than an underline/strikeout comparison, and if a single word differs, it regards the entire line as differing. WinDiff requires the use of an additional window that awkwardly shows a visual output of parallel, interconnected lines representing a map of each document on a line-by-line basis, based on a three-color highlighting scheme indicating first document text not in the second document, second document text not in the first document, and text in both documents. Connecting lines are used to connect text that appears in both documents, including moved text, as between the individual document maps. The whole approach is awkward and non-intuitive at best. WinDiff also ignores the document structure, and document headers are treated just like any other items of text. Thus, WinDiff contains all of the other deficiencies earlier noted with respect to Microsoft® Word and Workshare® DeltaView® that relate to this ignoring of the document structure.

Norton Utilities® File Compare represents something of an improvement over Microsoft® Word and Workshare® DeltaView®, because it entirely forgoes the underline strikeout methodology wherein two documents are merged into a single document for output with underlines and strikeouts, in favor of a side-by-side output of the two documents being compared. In contrast to DeltaView®, this side-by-side output does contain highlighting information summarizing the differences between the documents. That is, Norton Utilities® File Compare does omit the third window required by DeltaView®. However, Norton Utilities® File Compare still has a number of drawbacks.

First, the highlighting of similarities and differences does not occur at a word-by-word level, but appears to occur on a line-by-line or a segment-by-segment basis, and so tends to greatly overstate the degree of difference between the documents, and to greatly understate the degree of similarity between the documents. If perhaps two or three words in a ten or twelve word segment of text have been altered, Norton Utilities® File Compare will highlight the entire ten or twelve word segment as having been altered.

Further, Norton Utilities® File Compare, like DeltaView®, does distinguish text moves from additions and deletions. However, it does not actually move the output of text in the second document to match the sequencing of the text in the first document, nor does it allow for any form of active highlighting wherein the user can simply designate text in one document and find out where that same text exists in the other document. Instead, in the first document output, it inserts the phrase “moved from line x” opposite the pertinent second document text, where x is the line number in the first document where that same text originates. Similarly, in the second document output, it inserts the phrase “moved to line y” opposite the pertinent first document text, where y is the line number in the second document to which that text has been moved. An actual move of the second document text to juxtapose with the first document text, perhaps with some indication of where the moved text originated or of the fact that the text was moved and/or some form of active highlighting, would actually render the understanding of the text move less confusing.

Finally, Norton Utilities® File Compare, like Microsoft® Word, Microsoft® WinDiff, and Workshare® DeltaView®, also entirely ignores the document structure, and document headers are treated just like any other items of text. Thus, Norton Utilities® File Compare contains all of the other deficiencies earlier noted with respect to Microsoft® Word, Microsoft® WinDiff, and Workshare® DeltaView® that relate to this ignoring of the document structure. The entire document is again outputted en-masse, and the user has no way to start from a header-based “table of contents,” and to simply “drill down” into the sections of the documents that are of most interest for comparison, aided by suitable statistical or similar comparison summaries of similarities and differences, either for the whole document, or in association with various document substructures. There is again no context information outputted in conjunction with the document text that would enable the user to immediately determine where the outputted text fits in the context of the overall document structure. And again, individual subsections are not in any way “mapped” to one another before comparison, so that in deciding what to mark as similar and different, the computer processor's determination that something in the first document is “different” than something in the second document may be erroneous, because it is not comparing the right document subsections with one another following an appropriate subsection mapping.

SUMMARY OF INVENTION

Disclosed herein is a device, system and method of outputting information to compare at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of these documents. These documents comprise document subsections and the subsections comprise document subsection headers associated therewith. At least one of the first document subsection headers is juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping. This header mapping is established by: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and, in relation to the subsection mapping and the association between the document subsections and the subsection headers, further mapping the first document subsection headers relative to the second document subsection headers.

Also disclosed is a related device, system and method of outputting information to compare at least two documents, namely a first document and a second document, to facilitate visual mapping and comparison of the documents. At least one subsection of the first document is juxtaposed relative to an output of a corresponding at least one subsection of the second document to visually emphasize a comparison therebetween. Differences established on a word-by-word basis between the outputted first and second document subsections are highlighted with difference highlighting, and similarities established on a word-by-word basis between the outputted first and second document subsections are highlighted with similarity highlighting different from the difference highlighting.

Also disclosed is a related device, system and method of outputting information to compare subsections within a document. A plurality of subsections of the document are outputted. At least two selected ones of the document subsections are juxtaposed relative to one another, to visually emphasize correspondences therebetween, in response to comparison selecting the selected document subsections for comparison with one another. Differences identified among the selected document subsections are highlighted with difference highlighting, and similarities identified among the selected document subsections are highlighted with similarity highlighting different from the difference highlighting.

Also disclosed is a related device, system and method of establishing a mapping to compare at least two documents, namely, at least a first document and a second document. These documents comprise document subsections and the subsections comprise document subsection headers associated therewith. The first document subsections are mapped relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween.

Also disclosed is a related device, system and method of highlighting similar text segments. A first selected text segment is similarity highlighted on a computerized output device in response to selecting the first selected text segment using a computerized input device. Simultaneously, at least one other text segment similar to the first selected segment is also similarity highlighted in response to said selecting the first selected text segment.

BRIEF DESCRIPTION OF DRAWINGS

The features of the invention believed to be novel are set forth in the appended claims. The invention, however, together with further objects and advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings, in which:

FIGS. 1 and 2 respectively illustrate the text of a sample first document and a sample second document. This disclosure uses a comparison of these sample documents as the basis for illustrating the Device, System and Method for Determining Document Similarities and Differences disclosed herein.

FIG. 3 is a sample screen output illustrating an output of the sample first and second document headers in a fully collapsed (unexpanded) state, in which the first document is dominant and the second document is subservient.

FIG. 4 is a sample screen output illustrating a first level hierarchical structural mapping of the two sample documents to one another.

FIGS. 5 and 6 are sample screen outputs illustrating the further expansion of the hierarchical structural mapping of the two sample documents to one another.

FIG. 7 is a fully collapsed sample screen output similar to FIG. 3, but in which the second document is now dominant and the first document is now subservient.

FIG. 8 is a sample screen output similar to FIG. 4 illustrating a first level hierarchical structural mapping of the two sample documents to one another, with the second sample document being dominant.

FIG. 9 is a sample screen output illustrating the further expansion of the hierarchical structural mapping of FIG. 8.

FIG. 10 is a sample screen output illustrating the comparison of a text section of the first document to a text section of the second document mapped thereto, following selection of those text sections by a computer user.

FIG. 11 is a sample screen output illustrating the active highlighting of a selected similarity between the document subsections of FIG. 10.

FIG. 12 is a sample screen output illustrating a substitution list generated from the comparison outputted in FIG. 10.

FIG. 13 is a sample screen output illustrating the comparison of the first document to the second document, similar to FIG. 10, but at the hierarchical level of a section that encompasses several subsections thereof.

FIG. 14 illustrates an intra-document variation of the invention.

FIG. 15 is a screen shot taken from a prototype reduction to practice of the invention, showing the “mapped” (table of contents) view of FIGS. 3-9 for a complex legal document with a great deal of substructure.

FIG. 16 is a screen shot taken from the same prototype reduction to practice of FIG. 15, showing the comparison view of FIGS. 10, 11 and 13 for this more complex legal document.

DETAILED DESCRIPTION

To more fully illustrate the invention, FIGS. 1 and 2 display two somewhat-differing versions of a document which are to be compared using the invention disclosed herein. These sample documents in FIGS. 1 and 2 are hypothetical patent documents describing and claiming a “chair.” While the device described in these documents certainly is not novel in light of current-day art, the documents themselves provide a good illustration of the various features of the document mapping and comparison and related functionality that comprises the invention disclosed herein.

We will begin by discussing the differences and similarities between these two documents in detail. Then, we shall illustrate how these differences and similarities can be quickly and easily pinpointed according to the invention disclosed herein.

The two illustrative documents of FIGS. 1 and 2 are structured documents, which is to say that there is a definitive structure associated with these documents. Some of this structure is explicit in the form of various subsection headers that are part of the documents themselves. Some of this structure is implicit and is thus derived from the document itself, such as the tree structure inherent in a series of patent claims that are dependent one from the other. As will be discussed at length herein, some of the implicit structure is coerced into the documents through the actual comparison of the documents themselves.

These two illustrative documents are purposefully designed to have certain similarities as well as certain differences, since it is a key object of the invention to allow a user of this invention to look at a computerized output device such as a computer screen and quickly navigate through the documents and engage in a visual comparison of the documents to rapidly pinpoint the differences and similarities between the documents, as well as how extensive such differences and similarities are.

Structurally, it is to be noted that the headers are somewhat different between the “first document” 1 shown in FIG. 1 and the “second document” 2 shown in FIG. 2. The title of first document 1 is “CHAIR,” while that of second document 2 is “IMPROVED SEATING DEVICE.” In first document 1, the first subsection 101, in its header 102, is explicitly titled “Background of the invention,” while in second document 2, the first subsection 201 is explicitly titled, in its header 202, simply, “Background.” For these first subsections 101 and 201, including subsection headers 102 and 202, there are a total of five document segment differences between these subsections, encompassing eight words in first document 1 that are not in second document 2 and 11 words in second document 2 that are not in first document 1. These five document segment differences are as follows: First, as noted, the segment “of the invention” is deleted from the subsection title going from first document 1 to second document 2. Second, the segment “need” in first document 1 is replaced in second document 2 by the segment “become tired and would like.” Third, the segment “Beds enable” from first document 1 is replaced by “A bed enables” in second document 2. Fourth, the segment “to” is added to second document 2 between the words “not” and “sit.” Fifth, the segment “a halfway” in first document 1 is replaced by the segment “an intermediate” in second document 2. Although literally different, the segment “Beds enable” is similar to the segment “A bed enables” aside from one being in a plural and the other in a singular form. Thus, by a suitable use of “stemming” techniques, these segments may in fact be regarded as similar, not different, for purposes of mapping these documents to one another. This is the first of several cases to be discussed where a literal difference in one respect may in fact be regarded as a similarity in a different respect.

Including the headers 102 and 202 of these first subsections 101 and 201, there are also a total of six document segment similarities between these subsections, encompassing a total of 44 identical words that appear in these first subsections in both first document 1 and second document 2. Thus, if one tallies up the differences and similarities between the first subsections 101 and 201 including headers 102 and 202, of first 1 and second 2 documents, for example, by counting words, there are a total of 19=11+8 differences, and 88=44+44 similarities. Hence by this “word counting” similarity measure (which is but one example of how to measure this), these subsections are 88/(88+19)=88/107=82.2% similar. This similarity measure is even higher if stemming is employed to find a match between “Beds enable” and “A bed enables.”
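The arithmetic of this word-counting measure can be restated as a short Python sketch. The function name `word_count_similarity` and its parameter names are illustrative assumptions and do not appear in the disclosure; only the counts (8, 11 and 44 words) are taken from the text above.

```python
def word_count_similarity(only_in_first, only_in_second, shared):
    """Word-counting similarity: similar words / (similar + different words).

    Hypothetical helper. Each shared word is counted once per document,
    mirroring the 88 = 44 + 44 tally in the text.
    """
    similar = 2 * shared
    different = only_in_first + only_in_second
    total = similar + different
    # Assumed convention: two empty subsections score 0.0 (nothing to compare).
    return similar / total if total else 0.0


# Counts for first subsections 101 and 201, headers 102 and 202 included.
background = word_count_similarity(only_in_first=8, only_in_second=11, shared=44)
print(f"{background:.1%}")  # 82.2%
```

If stemming were applied so that “Beds enable” and “A bed enables” counted as a match, words would move from the difference counts to the shared count and the score would rise, as the text notes.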

It is important to note, however, that it is not a foregone conclusion at the outset that the first subsection of first document 1 will map to the first subsection of second document 2. For example, the second subsection 103 of first document 1 has a subsection header 104 titled “Objects of the invention,” while the second subsection 203 of second document 2 has a subsection header 204 titled “Figure.” Not only are these subsection headers different, but the content of these subsections is also entirely different. In first document 1, the body of the second subsection reads “It is therefore an object of the invention to provide a device for a person to sit in an intermediate position between standing, and lying down or sitting on a rug,” while in second document 2, the body of the second subsection reads “FIG. 1 shows a chair according to the preferred embodiment of the invention.”

An ordinary “underline-strikeout” program encountering this difference would simply conclude that “Objects of the invention/It is therefore an object of the invention to provide a device for a person to sit in an intermediate position between standing, and lying down or sitting on a rug” was replaced with “Figure/FIG. 1 shows a chair according to the preferred embodiment of the invention,” and would strikeout the former and underline the latter. However, what has really happened here is that the subsections of the two documents are somewhat out of sequence relative to one another. (The “/” is used to separate the subsection header from the subsection text.)

Particularly, in first document 1, the body of second section 103 reads “It is therefore an object of the invention to provide a device for a person to sit in an intermediate position between standing, and lying down or sitting on a rug.” In second document 2, the phrase “person to sit in an intermediate position between standing, and lying down or sitting on a rug” also appears. But it is in a separate sentence at the end of the third subsection 205 with the explicit header 206 titled “Summary,” and does not have its own explicit header. It is important to highlight this information to the user. Conversely, second subsection 203 comprising the sentence “FIG. 1 shows a chair according to the preferred embodiment of the invention” appears absolutely verbatim in both first document 1 and second document 2, but in first document 1, it is in the fourth section 105 under the fourth section header 106 “Drawings,” while in second document 2, it is in second section 203 under the second header 204 “Figure.” It is important to also highlight this information to the user.

In terms of similarities and differences, the second subsection 103 of first document 1 has absolutely no similarity with and is completely different from the second subsection 203 of second document 2. However, it does map quite well to the sentence at the end of the third subsection 205 titled “Summary” in second document 2. In particular, the segment “It is therefore an object of the invention to provide a device for” (13 words) is deleted in going from document 1 to document 2, but the remaining segment “a person to sit in an intermediate position between standing, and lying down or sitting on a rug” (18 words) is left fully intact without change. Further, it is the fourth subsection 105 of first document 1, explicitly entitled with the fourth section header 106 “Drawings,” that maps with complete similarity and no difference to the second subsection 203 of second document 2, with the second section header 204 explicitly entitled “Figure.” If one tallies the word similarities and differences between “Drawings/FIG. 1 shows a chair according to the preferred embodiment of the invention” from the fourth subsection 105/106 of first document 1 with “Figure/FIG. 1 shows a chair according to the preferred embodiment of the invention” from the second section 203/204 of second document 2, the word “Drawings” is deleted from first document 1 and replaced by the word “Figure” in second document 2 (2 differences), while all 13 words of the subsections themselves (105 and 203) remain unchanged (26 similarities). Thus, by the similarity measure earlier discussed, these are 26/28=92.9% similar and ought to be mapped together with one another.
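The subsection mapping just described can be sketched in Python as a "pick the most similar candidate" step. This is only an illustration under stated assumptions: the disclosure does not say how matching words are found, so the standard library's `difflib.SequenceMatcher` is swapped in as a stand-in; the helper names `similarity` and `best_match` are hypothetical; and the “Summary” entry contains only the closing fragment quoted above, since the rest of that subsection is not reproduced in this text.

```python
from difflib import SequenceMatcher


def similarity(first_text, second_text):
    """Word-counting similarity between two header-plus-body strings."""
    a, b = first_text.split(), second_text.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # Words in matching blocks are "shared"; each counts once per document.
    shared = sum(block.size for block in matcher.get_matching_blocks())
    total = len(a) + len(b)
    return 2 * shared / total if total else 0.0


def best_match(first_section, second_sections):
    """Return the header of the second-document section most similar to first_section."""
    return max(second_sections,
               key=lambda header: similarity(first_section, second_sections[header]))


FIGURE_SENTENCE = "FIG. 1 shows a chair according to the preferred embodiment of the invention"
SEATED = ("a person to sit in an intermediate position between standing, "
          "and lying down or sitting on a rug")

second_document = {
    "Figure": "Figure " + FIGURE_SENTENCE,
    "Summary": "Summary " + SEATED,  # closing fragment only; full text not quoted
}
objects = ("Objects of the invention It is therefore an object of the invention "
           "to provide a device for " + SEATED)
drawings = "Drawings " + FIGURE_SENTENCE

print(best_match(objects, second_document))   # Summary
print(best_match(drawings, second_document))  # Figure
print(f"{similarity(drawings, second_document['Figure']):.1%}")  # 92.9%
```

A production mapping would presumably also need a threshold below which a subsection is treated as having no counterpart at all; that choice is left open here.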

FIG. 3 shows a top level contracted (unexpanded) structural comparison of first document 1 with second document 2. All that is outputted on computerized output device 14 is the title 301 of first document 1 juxtaposed horizontally to the left of the title 302 of second document 2, and a “+” expansion button 303 (or similar expansion means) that a user can select with a computerized input device, for example, by pointing and clicking a mouse, in order to expand this output down to the next hierarchical level.

By clicking the “+” expansion button 303 in FIG. 3, the user arrives at a first level hierarchical structural mapping of the two documents to one another as shown in FIG. 4. This click on expansion button 303 simultaneously expands the output of both the first document subsection headers 12 and the second document subsection headers 13. Were one to click on the “−” contraction button 401 to the left of “CHAIR,” this output would contract the first and second document header outputs simultaneously, back to the outputs of FIG. 3. This type of expansion and contraction is widely practiced in the prior art for a single structure, i.e., for a tree of directories on a computer or for a single structured document. But the simultaneous expansion and contraction of two parallel structures being compared to one another does not appear to be disclosed or suggested in the prior art.
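One way to picture the simultaneous expansion and contraction is a tree in which each node holds a pair of mapped headers and a single expanded flag, so that one click necessarily affects both documents' outputs together. The Python sketch below is a hypothetical data model, not the disclosed implementation: the class name `MappedRow` and its methods are assumptions, and only header pairs named in this text are included as children.

```python
from dataclasses import dataclass, field


@dataclass
class MappedRow:
    """One output row: a first-document header juxtaposed with its mapped
    second-document header. A single flag governs both sides."""
    first_header: str
    second_header: str
    children: list = field(default_factory=list)
    expanded: bool = False

    def toggle(self):
        """Model a click on the "+" or "-" button for this row."""
        self.expanded = not self.expanded

    def visible_rows(self, depth=0):
        """Return (depth, first_header, second_header) for every row shown."""
        rows = [(depth, self.first_header, self.second_header)]
        if self.expanded:
            for child in self.children:
                rows.extend(child.visible_rows(depth + 1))
        return rows


root = MappedRow("CHAIR", "IMPROVED SEATING DEVICE", children=[
    MappedRow("Background of the invention", "Background"),
    MappedRow("Objects of the invention", "Summary"),
])

collapsed = root.visible_rows()  # titles only, as in FIG. 3
root.toggle()                    # click "+"
expanded = root.visible_rows()   # both header columns expand together
root.toggle()                    # click "-" to return to the collapsed view
```

Because the pair shares one flag, there is no state in which one document's headers are expanded while the other's are collapsed.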

Referring now to FIG. 4, it is first to be noted that the user, in making the document comparison, needs to select one of the two documents as the “dominant” document, and the other as the “subservient” document. In FIG. 4, first document 1 has been selected to be dominant, and second document 2 has been selected to be subservient. This is needed because, to the extent that the documents are out of sequence with one another (or have different structures, to be discussed shortly), the dominant document is used to establish the sequence (and structure) for output, while the output of the subservient document is resequenced (and restructured) to match the output of the dominant document. That is, the outputs of the first document subsection headers 102, 104, 106 etc. are juxtaposed relative to the outputs of the second document subsection headers 202, 204, 206 etc. to visually emphasize the mapping 4 between these subsection headers, in particular, by resequencing (and restructuring) the subservient document to match the original sequence (and structure) of the dominant document.

Let us examine how this addresses the resequencing discussed earlier. In FIG. 4, the output of the first document subsection header “Background of the invention” (102) is juxtaposed horizontally with the output of the second document header “Background” (202). This is based on the determination made earlier that these sections match up with a high degree of similarity (82.2% using the similarity measure discussed above), or a similar type of determination that would be apparent to someone of ordinary skill and which is considered to fall within the scope of this disclosure and its associated claims. No resequencing needs to occur for these subsections.

However, it is now to be observed that the output of the first document subsection header “Objects of the invention” (104) has juxtaposed to its right, an output of the second document subsection header “Summary” (206). This tells the user that whatever text material appears in the subsection body (103) under the header “Objects of the invention” (104) in first document 1 maps most closely to the text material that appears under the header “Summary” (206) in second document 2, even though these are out of sequence in the original documents 1 and 2. In particular, this has occurred because prior to outputting the header juxtapositions in FIG. 4, the underlying computerized system has already performed a mapping of the first document subsections 101, 103, 105 and other sections not expressly given a reference number, relative to the second document subsections 201, 203, 205 and other sections also not expressly given a reference number, to determine which headers 102, 104, 106, and others without reference numbers in first document 1 relate most closely to which headers 202, 204, 206 and others without reference numbers in second document 2. Then, the output of the first document subsection headers 102, 104, 106 etc. in juxtaposition to the second document subsection headers 202, 204, 206 etc. occurs such that the second document subsection headers 202, 204, 206 etc. are resequenced and restructured prior to output, irrespective of their original sequencing, and irrespective of their original structuring. In different terminology, the sequence and structure of the subservient document is “coerced” to take on the sequence and structure of the dominant document.
During this mapping, the computerized system ascertained a high degree of similarity between the header/subsection 104/103 of first document 1 and header/subsection 206/205 of second document 2 based on the identical phrase “person to sit in an intermediate position between standing, and lying down or sitting on a rug” appearing in both documents, as discussed earlier. As a consequence, the computerized system causes the “Objects of the invention” header 104 of first (dominant) document 1 to be outputted next to the “Summary” header 206 of second document 2, by upwardly resequencing the “Summary” header 206 of the second (subservient) document in the FIG. 4 output.

Note that for second document 2, in addition to resequencing, a form of restructuring has also taken place, and an implicit header has been added. In particular, the sentence “It is therefore an object of the invention to provide a device for a person to sit in an intermediate position between standing, and lying down or sitting on a rug” in first document 1 was structured under its own heading, namely “Objects of the invention” (104). However, the sentence “This enables a person to sit in an intermediate position between standing, and lying down or sitting on a rug” in second document 2 did not have its own heading, but was a second paragraph under the “Summary” header 206. Because first document 1 has been chosen by the user to dominate the structure, the sentence “This enables a person to sit in an intermediate position between standing, and lying down or sitting on a rug” in second document 2 is coerced to take on a header to match the first document 1 structure (the structure of the subservient document is coerced to that of the dominant document), and this header is implicitly drawn from the “Summary” header 206 under which this sentence falls in second document 2.

In FIG. 4, there is next a juxtaposition of the first document header “Summary” (108) astride a replication of the second document header “Summary” (206). This is because the computerized system has mapped the full paragraph 107 under the “Summary” header 108 in first document 1 to the first full paragraph under “Summary” header 206 in second document 2, and has determined that these are highly similar. The differences are that the word “posterior” in first document 1 is replaced by the word “buttocks” in second document 2 (2 words), and that second document 2 has added the new sentence “Optional arm rests provide a locale for the user to rest his or her arms.” (15 words) But the similarities are in a total of 53 words in two segments that are identical from one document to the other, yielding in the exemplary counting method used earlier 106 similarities and 17 differences, or a 106/123=86.2% similarity. (Incidentally, the similarity measures can be and preferably are more complex than just the simplified word counting used for illustration here. For instance, the fact that the 53 matching words occur in only two segments further strengthens the match, as opposed to, for example, if these 53 words were to be randomly scrambled into a large number of segments.) If a thesaurus (or some means of detecting synonyms) is used during this mapping of the two documents, then the word “posterior” is considered similar to, not different from, “buttocks,” and the similarity measure is enhanced even further, especially since the two segments separated by “posterior” and “buttocks” can be considered as a single segment with no differences. This is another instance in which a literal difference in one respect may in fact be regarded as a similarity in a different respect. (Note that the use of “segment” is considered to cover a document region spanning one or more words or punctuation marks.)

Next we come to the first document header “Drawings” 106 which is juxtaposed to the second document header “FIG.” 204. Without a thesaurus showing synonyms, the underlying computerized system would have no way of knowing that these two headers should be mapped together. But, by mapping the text subsections 105 and 203 under these respective headers, and ascertaining that each of these headers is associated with the identical text “FIG. 1 shows a chair according to the preferred embodiment of the invention,” the computerized system deduces that the header “FIG.” 204 in the subservient document maps to the header “Drawings” 106 in the dominant document, and coerces a downward resequencing of the header “FIG.” 204 from the subservient document to match the original sequence of the dominant document. With a thesaurus to match “Drawings” with “Figure,” these sections are mapped to one another with 100% similarity. Even without the thesaurus, and even if “Drawings” is regarded as different from “Figure,” the mapping of 14 words per document is 13 similar words per document and 1 different word per document (26/28=92.9%) if the headers are considered in the mapping, and 100%=26/26 if the headings are discarded and the mapping is based only on the subsection text and not the associated headers. In all events, the similarity detected by the mapping is significant enough to coerce the “Figure” header 204 in the subservient document down to the “Drawings” header 106 in the dominant document.
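The simplified word-counting measure used in these examples can be sketched as follows. This is a minimal illustration, not the preferred measure: it assumes whitespace tokenization, uses Python's standard `difflib.SequenceMatcher` to find the matching word runs, and the function name `word_similarity` is introduced here for illustration only. Each matched word counts once per document as a similarity, and each unmatched word counts as a difference.

```python
from difflib import SequenceMatcher


def word_similarity(text_a: str, text_b: str) -> float:
    """Similar words / (similar + different words), counted over both texts."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    # Total number of words that fall inside identical runs.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    similar = 2 * matched                      # counted once per document
    different = (len(words_a) - matched) + (len(words_b) - matched)
    total = similar + different
    return similar / total if total else 1.0


body = "FIG. 1 shows a chair according to the preferred embodiment of the invention"

# Headers included: 13 matching words and 1 differing word per document.
with_headers = word_similarity("Drawings " + body, "Figure " + body)

# Headers discarded: the subsection text alone is identical.
without_headers = word_similarity(body, body)
```

Under these assumptions `with_headers` evaluates to 26/28 and `without_headers` to 26/26, matching the figures given above. A fuller measure would also weigh how few segments the matching words fall into.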

If the user at this juncture decides to use the computerized input device, e.g., mouse, to select the “+” expansion button 402 next to the heading “Detailed description,” the output expands from that of FIG. 4 to FIG. 5. Again, the heading outputs of both documents expand simultaneously in response to this single selection. By virtue of the numbering i), ii) and iii) and/or underlining in the “Detailed description” subsection of first document 1, the computerized system identifies the sentences associated with this numbering as defining a second-level hierarchical structure. (Of course, other identifiers apparent to someone of ordinary skill besides numbering or underlining can be used to identify subsections within the scope of this disclosure and its associated claims.) During the mapping to the “Detailed description of the invention” section of second document 2, the computerized system finds a high degree of similarity. However, in second document 2, there is no explicit substructure, and the ordering of the “seat” and “legs” discussion is reversed.

Because first document 1 dominates, the “Detailed description of the invention” in second document 2 is coerced to take on a structure parallel to that of first document 1, and it is also coerced into being resequenced to match the sequence of first document 1. Thus, the first words of the sentences from second document 2 are coerced into being implicit headers for these sections (one can find other ways to establish implicit headers; picking the first word, or a leading phrase, is just an example), and are also coerced into resequencing to match the first (dominant) document 1. Thus, as shown in FIG. 5, “i) Seat:” is juxtaposed to “Second;” “ii) Legs” is juxtaposed to “First;” and “iii) Back:” is juxtaposed to “Finally.”
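The first-word approach to implicit headers mentioned above might look like the following sketch. The sentences shown are hypothetical stand-ins for the second document's detailed description, whose full text is not reproduced here, and `implicit_header` is a name introduced purely for illustration.

```python
import re


def implicit_header(sentence: str) -> str:
    """Use the leading word of a sentence as its implicit header."""
    match = re.match(r"\s*([A-Za-z]+)", sentence)
    return match.group(1) if match else ""


# Hypothetical sentences; only their leading words are taken from the text.
sentences = [
    "First, four legs hold the seat above the floor.",
    "Second, the seat supports the user.",
    "Finally, a back supports the user's spine.",
    "Optionally, arm rests are attached to the seat.",
]
headers = [implicit_header(s) for s in sentences]
```

A leading phrase, a topic noun, or any other cue could be substituted for the first word without changing the surrounding mapping logic.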

It will now be observed that in second document 2, a new feature is added to the chair, namely, an armrest. This is described in the final sentence of the detailed description that begins with “Optionally,” and is coerced into being yet another subsection of second document 2 in order to match with the first document structure. However, there is no counterpart to this section in first document 1. Because this second document subsection is determined to have no mapping with any first document subsection, a null subsection header 503 is coerced into first document 1 (that is, first document 1 is “self-coerced” to take on an extended structure to add a fourth “null” heading to extend the structure it established with the i), ii) and iii) numbering), and is juxtaposed in the output against the second document subsection 41 for which no mapping is determined. The null header 503 “(no match)” is outputted for example here, but any of a limitless variety of phrases or outputs (or even a blank space) can be used to indicate such a null correspondence.
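The resequencing and null-header behavior of FIG. 5 can be illustrated with a short sketch. It assumes that a mapping from dominant subsection index to subservient subsection index has already been computed; the function name `header_pairs` and the placement of unmatched subsections at the end of the list are choices made for this illustration only.

```python
from typing import Dict, List, Optional, Tuple

NULL_HEADER = "(no match)"   # any phrase, or a blank, could serve instead


def header_pairs(
    dominant: List[str],
    subservient: List[str],
    mapping: Dict[int, Optional[int]],
) -> List[Tuple[str, str]]:
    """Emit header pairs in dominant order; unmapped headers get a null partner."""
    pairs = []
    for i, header in enumerate(dominant):
        j = mapping.get(i)
        partner = subservient[j] if j is not None else NULL_HEADER
        pairs.append((header, partner))
    # Subservient subsections with no counterpart force a null dominant header.
    mapped = {j for j in mapping.values() if j is not None}
    for j, header in enumerate(subservient):
        if j not in mapped:
            pairs.append((NULL_HEADER, header))
    return pairs


dominant = ["i) Seat:", "ii) Legs", "iii) Back:"]
subservient = ["First", "Second", "Finally", "Optionally"]
mapping = {0: 1, 1: 0, 2: 2}   # seat and legs are discussed in reverse order
pairs = header_pairs(dominant, subservient, mapping)
```

Here the subservient headers are emitted in the dominant document's order, and “Optionally” is paired with the null header because nothing in the dominant document maps to it.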

Now, let us suppose that starting from the output in FIG. 5, the user selects the “−” contraction button 501 next to “Detailed description” to cause a simultaneous contraction of each document, and then selects the “+” expansion button 502 next to claims. Then, the user selects any further expansion buttons showing up under the claims to fully expand all of the claims. The resultant output will then be that of FIG. 6. While there are minor differences, it is clear that claims 1 and 2 in each document are sufficiently similar such that they will map to one another, but that claims 3 and 4 in second document 2 have no counterpart in first document 1.

In this case, the claim numbers, as well as the leading text in each claim, are used to define the subsection headers outputted in FIG. 6. However, without some understanding of claiming, each claim would be equally indented, and there would be no hierarchy associated with these claims based solely on their headers. First document 1 is “self-coerced” to extend the structure established by its claims 1 and 2, thus providing a null mapping 601 to claims 3 and 4 of second document 2.

However, if the invention disclosed herein is to be used with certain specialized types of documents that have certain specialized structures, such as but not limited to patent documents, then the programming that drives the underlying computerized system can be implemented to recognize certain aspects of these documents besides their explicit headers, in order to derive a structure for these documents. Thus, for patent claims, since claims 2 and 3 are dependent on claim 1, they are one level down in the structural hierarchy from claim 1, and since claim 4 depends on claim 2, it is one level down in the structural hierarchy from claim 2. Methods for deducing this structure from the claim text should be obvious to someone of ordinary programming skill.
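One possible way of deducing the claim hierarchy from claim text is sketched below. It assumes that a dependent claim names its parent with the words “claim N” and that only the first such reference matters; multiple dependencies and other drafting conventions are ignored. The claim texts are hypothetical, and `claim_parents` and `claim_depth` are illustrative names.

```python
import re
from typing import Dict, Optional

CLAIM_REF = re.compile(r"\bclaim\s+(\d+)", re.IGNORECASE)


def claim_parents(claims: Dict[int, str]) -> Dict[int, Optional[int]]:
    """Map each claim number to the claim it depends on (None if independent)."""
    parents: Dict[int, Optional[int]] = {}
    for number, text in claims.items():
        match = CLAIM_REF.search(text)
        parent = int(match.group(1)) if match else None
        # Accept only references to an earlier claim, which rules out cycles.
        parents[number] = parent if parent is not None and parent < number else None
    return parents


def claim_depth(number: int, parents: Dict[int, Optional[int]]) -> int:
    """Number of levels below an independent claim."""
    depth = 0
    while parents[number] is not None:
        number = parents[number]
        depth += 1
    return depth


# Hypothetical claim texts following the dependency pattern described above.
claims = {
    1: "A chair comprising a seat and a plurality of legs.",
    2: "The chair of claim 1, further comprising a back.",
    3: "The chair of claim 1, wherein the seat is padded.",
    4: "The chair of claim 2, further comprising arm rests.",
}
parents = claim_parents(claims)
depths = {n: claim_depth(n, parents) for n in claims}
```

The resulting depths place claims 2 and 3 one level below claim 1, and claim 4 one level below claim 2, which is the structure described above.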

The key point is that the determination of document structure derives explicitly and/or implicitly from various factors. Of course, an explicit header in front of a subsection provides explicit structure, especially if it is numbered/lettered in accordance with a “table of contents” schema that implies a certain hierarchical structure, or if the documents comprise an expressly-represented table of contents. But, the structure can be implicitly derived in a wide variety of ways apparent to someone of ordinary skill within the scope of this disclosure and its associated claims, based on some understanding of the particular types of documents that are intended to be compared.

In the foregoing discussion, first document 1 was selected by the user to be dominant, and second document 2 was selected by the user to be subservient. FIG. 7 now begins to illustrate the reversed situation, in which second document 2 is selected as the dominant document and first document 1 is selected to be subservient. Thus, in FIG. 7, the title of second document 2 is outputted on the left and that of first document 1 is outputted on the right. There are of course many ways obvious to someone of ordinary skill in which this selection of which document is dominant can occur, and how this selection is made in detail is much less pertinent than the fact that this selection can be made. In one such method, for example, the user might have outputted a list of documents, and would first point and click on the document which is to be dominant, and would second point and click on the document which is to be subservient. If three or more documents are to be compared, one is selected to be dominant, and all others are selected to be subservient and will thus be coerced to the dominant document.

Starting from FIG. 7, a mouse point and click on the “+” expansion button 701 next to the titles again simultaneously expands the output of the headers of the two documents which have been mapped to one another by the computerized system, as shown in FIG. 8. But here, second document 2 dominates the sequence and the structure. Hence, since the “Figure” heading 204 heads the second subsection of the dominant document, it is the second subsection header shown in the left-hand column, and the “Drawings” header 106, which is originally sequenced as the fourth section of the subservient document, is coerced to be resequenced up to the second header position in the right-hand column. “Summary” matches up to “Summary,” but with one difference from the situation of FIGS. 3 through 6. Second document 2 now dominates the structure. Although the sentence “It is therefore an object of the invention to provide a device for a person to sit in an intermediate position between standing, and lying down or sitting on a rug” has its own heading in the original structuring of the first (now subservient) document 1, its counterpart phrase “This enables a person to sit in an intermediate position between standing, and lying down or sitting on a rug” in the second (now dominant) document 2 does not have its own header, but is subsumed under the “Summary” header. Thus, since the original structure of second document 2 now dominates, the “Objects of the invention” header 104 is coerced entirely out of the output in FIG. 8, and does not appear anywhere.

Similarly, since “Detailed description of the invention” in the second (now dominant) document 2 does not contain the second-level subsections denoted by roman numerals i), ii) and iii) of first document 1, first document 1 is coerced to lose this second-level substructure, and all that is outputted is the header “Detailed description of the invention” juxtaposed to “Detailed description,” but without any “+” expansion button leading to any further substructure.

The phrase “I claim:” maps to “Claims,” and the abstracts map to one another.

If the user now points and clicks on the “+” expansion button 801 next to “I claim:,” and further clicks on all expansion buttons within the claims, the claims are fully expanded as shown in FIG. 9. Claims 1 and 2 map from one document to the other, while claims 3 and 4 of second document 2, which have no counterparts in first document 1, are outputted in juxtaposition to a null, e.g., “no match” header 901.

At this point, enough examples have been provided so that it becomes possible to talk about the invention in more general terms. In the broadest terms, there are two main processes that take place, namely, “mapping,” followed by “comparison.” We discuss this in further detail below.

First, “subsections” are identified for each document to be compared. Two (or possibly more) documents are mapped to one another to establish and output correspondences between their various identified subsections based on identifying substantial similarities between (or among) these document subsections. These correspondences are then outputted as already discussed in connection with FIGS. 3 through 9. If more than two documents are compared at a single time, the additional documents are juxtaposed just as in FIGS. 3 through 9, but with an additional column (or similar juxtaposition structure) for each of the third and subsequent documents. One of these documents is still selected as the dominant document for the purpose of coercing the sequencing and structure of the other documents.

For the simplest case of two documents, once these “subsections” of each document are suitably identified by the computerized system, all subsections of first document 1 are combinatorially compared against all subsections of second document 2, and a similarity measure (such as but not limited to the “word counting” similarity measure described earlier as an example of a similarity measure) is established for each section of first document 1 in relation to every section of second document 2. (For more than two documents, mapping is carried out for each document in reference to the dominant document.) Of course, the similarity measure will be close to zero for some section map combinations, and close to 1 (or, e.g., 100%) for other section map combinations. The strongest similarity measures establish the mapping. Preferably, as discussed earlier, a thesaurus is used to map synonyms to one another and stemming is used to map matching root words and phrases to one another, such that these are all regarded as similar, not different, for the purpose of this mapping. (Later on, when the subsections themselves are compared, these will be shown as differences.)
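The combinatorial comparison just described might be sketched as follows. This is one illustrative approach under stated assumptions: every dominant subsection is scored against every subservient subsection, and pairs are then accepted greedily from the strongest score downward, subject to a cutoff. The greedy strategy, the 0.5 threshold, and the names `similarity` and `map_subsections` are all assumptions of this sketch; the sample texts are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(text_a: str, text_b: str) -> float:
    """Word-level similarity between 0.0 and 1.0."""
    words_a, words_b = text_a.lower().split(), text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()


def map_subsections(
    dominant: List[str],
    subservient: List[str],
    threshold: float = 0.5,      # assumed cutoff for a "substantial" similarity
) -> Dict[int, Optional[int]]:
    """Pair each dominant subsection with its strongest unused counterpart."""
    scores = sorted(
        ((similarity(d, s), i, j)
         for i, d in enumerate(dominant)
         for j, s in enumerate(subservient)),
        reverse=True,
    )
    mapping: Dict[int, Optional[int]] = {i: None for i in range(len(dominant))}
    used = set()
    for score, i, j in scores:
        if score < threshold:
            break                # remaining pairs are too weak to map
        if mapping[i] is None and j not in used:
            mapping[i] = j
            used.add(j)
    return mapping


# Hypothetical subsection texts, deliberately out of sequence.
dominant = ["chairs let people sit down", "FIG. 1 shows a chair"]
subservient = [
    "FIG. 1 shows a chair",
    "chairs let tired people sit down",
    "optional arm rests support the arms",
]
mapping = map_subsections(dominant, subservient)
```

In this example the first dominant subsection maps to the second subservient one and vice versa, while the third subservient subsection remains unmapped and would be paired with a null header on output.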

The results of this mapping are used to determine which subsections of second document 2 (and third, fourth, etc., if more than two documents are compared) map most closely to which sections of first document 1. The documents other than the dominant document are then coerced into resequencing and restructuring based on how they map to the dominant document, and the document headers associated with the various document subsections are then outputted in the manner described in connection with FIGS. 3 through 9. For two document comparison, each pair of headers that is shown in juxtaposition on the output comprises a “header pair.” For three documents, the mapped headers comprise header triplets, etc.

Second, again for the simplest case of two documents, after the results of this mapping are outputted as in FIGS. 3 through 9, the user then has the ability to select one or more of the header pairs from these mapping outputs, and based on this selection, to simultaneously output first document 1 subsections and second document 2 subsections associated with this header pair selection, juxtaposed to visually emphasize the comparison of these subsections. FIGS. 10, 11 and 13 to be discussed shortly, are used to illustrate the comparison that occurs after the mapping is complete.

One will note the presence of a small circle in the middle of the two columns in FIGS. 3 through 9. This small circle (or similar device) provides a comparison selection means for selecting a particular header pair in order to output a detailed comparison of the subsections associated with the selected header pair.

For example, starting from any of the outputs of FIGS. 4, 5, or 6, and thus knowing that the computerized system has mapped first document subsection 101 titled with the header “Background of the invention” 102 to second document subsection 201 titled with the header “Background” 202, suppose the user would now like to see, in detail, how these two subsections 101 and 201 compare to one another. To obtain such a comparison, the user uses the computerized input device to select the header pair “Background of the invention” and “Background,” for example, by pointing and clicking on the comparison selection means 410 (e.g., the small circle) between (or juxtaposed in some suitable configuration relative to) this header pair.

The result of this selection on comparison selection means 410 in any of FIGS. 4, 5, or 6 is shown in FIG. 10. In FIG. 10, the text of the selected subsection 101 of first document 1 (and optionally its header 102) is outputted in the left-hand column, and the text of the selected subsection 201 of second document 2 (and optionally its header 202) is juxtaposed relative thereto in the right hand column. This is a simultaneous output of the subsections from both the first and second document in response to a single point and click on the comparison selection means 410 associated with this subsection header pair. Optional subsection context information is also outputted toward the top of the screen, showing where this all fits in the context of the overall document structure. Let us examine this in more detail.

First, turning to the detailed text of the selected subsections 101 and 201, FIG. 10 shows difference highlighting 1002 in the form of rectangles (and vertical lines) that are drawn around (or proximate) each segment of text that differs from its counterpart in the other document. Thus, the segments discussed earlier that are different from one another are highlighted in this way. Thus, the difference highlighting of “need” in first document 1 and of “become tired and would like” in second document 2 visually emphasizes to the user, the fact that “need” is modified (replaced) by “become tired and would like” when moving from first document 1 to second document 2. The difference highlighting of “Beds enable” and “A bed enables” visually emphasizes that “Beds enable” is replaced by “A bed enables” when moving from first document 1 to second document 2. The optional “addition marker” e.g., a vertical line or similar device to the right of “not” in first document 1 and the difference highlighting of “to” in second document 2 visually emphasizes that there is something added to second document 2 at the location in first document 1 marked by the addition marker, and that that addition is the word “to” in the highlighted spot in second document 2. Finally, the difference highlighting shows the replacement of “a halfway” with “an intermediate” when moving from the first to the second document.

The comparison of “Beds enable” and “A bed enables” is of further interest because, as discussed earlier, during the mapping phase, these two phrases, using stemming, were regarded as equivalent. But, in the comparison stage, the differences in these phrases are illustrated and emphasized. That is to say, for the purpose of mapping together the two documents to determine which subsections map to one another and are thus most appropriately compared, it is desirable to declare a “similarity” rather than a “difference” when one root phrase is simply replaced by a similar root phrase that is simply stemmed differently (e.g., truncated, re-tensed, pluralized or singularized, capitalized or lower-cased, differently punctuated) or when one word or phrase is replaced by a synonymous word or phrase, since this will facilitate more accurate mapping. Conversely, when comparing text to text as in FIG. 10, then it is desirable to highlight all differences, so that the user can easily determine that one root phrase has been replaced by a literally different, though semantically similar (stemmed) root phrase, or that one word or phrase has been replaced by a synonymous word or phrase.
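The distinction between the two phases can be illustrated by normalizing words before mapping while leaving them untouched for comparison. The sketch below is deliberately crude: the one-entry thesaurus and the strip-a-trailing-“s” stemmer are placeholders invented for this illustration, and a practical system would use a real thesaurus and stemming algorithm.

```python
import string
from typing import Dict, List

# Hypothetical one-entry thesaurus mapping a synonym to a canonical word.
SYNONYMS: Dict[str, str] = {"buttocks": "posterior"}


def stem(word: str) -> str:
    """Naive stemmer: drop a trailing 's' from words longer than three letters."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def normalize(text: str) -> List[str]:
    """Tokens as seen by the mapping phase: case, punctuation, stems, synonyms folded."""
    tokens = []
    for raw in text.split():
        word = raw.strip(string.punctuation).lower()
        if not word:
            continue
        word = SYNONYMS.get(word, word)   # thesaurus lookup first
        tokens.append(stem(word))         # then stemming
    return tokens


# Mapping phase: these are treated as similar.
same_for_mapping = normalize("Beds enable") == normalize("bed enables")
synonyms_match = normalize("buttocks") == normalize("posterior")

# Comparison phase: the literal words still differ and would be highlighted.
differ_for_comparison = "Beds enable".split() != "bed enables".split()
```

The same pair of phrases is therefore a similarity when deciding which subsections map together and a difference when the mapped subsections are displayed side by side.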

Turning now to the headers 102 and 202 just above the subsections 101 and 201, the differences in the headers are also highlighted in exactly the same manner as differences in the subsections. Here, the optional “deletion marker” e.g., a vertical line or similar device to the right of “Background” in second document 2 coupled with the difference highlighting of the phrase “of the invention” in first document 1 visually emphasizes to the user that the phrase “of the invention” is deleted when moving from the first to the second document, and shows where it is removed from.

Also shown (in this case by its absence) is similarity highlighting different from the difference highlighting, showing which phrases are unchanged moving from first document 1 to second document 2. In this example, the fact that no highlighting at all appears on the similar phrases in FIG. 10 serves to advise the user that these phrases are unchanged. That is, in FIG. 10, only differences are highlighted, and unhighlighted text is regarded as similar (unchanged).

Note that although similarities and differences are illustrated for document segments, they are established on a word-by-word basis. This is in contrast to comparing documents, e.g., line-by-line and determining that two lines are different from one another if any words in the line are different, which would grossly overstate differences and understate similarities.
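A word-by-word comparison that reports differing segments might be sketched as follows, using Python's standard `difflib.SequenceMatcher` over word lists rather than lines. The two sentences are hypothetical, assembled around the fragments quoted above, and `word_differences` is an illustrative name.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def word_differences(text_a: str, text_b: str) -> List[Tuple[str, str, str]]:
    """Return (kind, segment in first text, segment in second text) for each change."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    changes = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag != "equal":           # unchanged segments are left unhighlighted
            changes.append(
                (tag, " ".join(words_a[a0:a1]), " ".join(words_b[b0:b1]))
            )
    return changes


# Hypothetical sentences built from the quoted fragments.
first = "Beds enable them to lie down, but not sit"
second = "A bed enables them to lie down, but not to sit"
changes = word_differences(first, second)
```

For these inputs the result contains a replacement of “Beds enable” by “A bed enables” and an insertion of “to”, whereas a line-by-line comparison would have reported the entire line as changed.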

Obvious variations in this difference/similarity highlighting scheme will be apparent to someone of ordinary skill, and it is to be understood that all such variations are considered to be within the scope of this disclosure and its associated claims. For example, in the implementation illustrated here, the similarities are in fact identified by their absence of highlighting, i.e., by their being outputted as ordinary text. Then, when the cursor is pointed over one of these similar document segments (even without a click) that segment, as well as its counterpart segment in the other document, will be additionally, expressly highlighted for similarity. Thus, as shown in FIG. 11, when the user moves the input device, e.g., cursor pointer 1101 controlled by a mouse over the segment “them to lie down, but not” in first document 1 (or alternatively, in second document 2), this segment receives additional similarity highlighting in the form of an underline (or any other suitable highlighting) that appears with respect to the appearance of this segment in both the first and second documents. That is, a single point simultaneously highlights similar phrases in both documents. In this manner, the user can effectively ask what phrase in second document 2 is identical to a selected (pointed to) phrase in first document 1 (or vice versa), by pointing to the phrase in first document 1 and having that phrase as well as its second document counterpart receive additional similarity highlighting (or vice versa). This will be referred to herein as “active highlighting,” since it is actively responsive to the user input device in the manner described.

More generally, any two (or more) text segments that are outputted on a computer output device can be actively highlighted in this way, even two text segments that are within a single document. A first selected text segment (e.g., the segment “them to lie down, but not” in the left hand column of FIG. 11) is similarity highlighted in response to selecting that first selected text segment using a computerized input device. Simultaneously, this causes other text segments similar to the first selected text segment (e.g., “them to lie down, but not” in the right hand column of FIG. 11) to also be similarity highlighted in response to selecting the first selected text segment using the computerized input device.

This active highlighting (“additional similarity highlighting”) has very general utility in a wide range of situations. For example, if a first text reads “John gave Mary a kiss” and a second reads “Mary gave John a kiss,” placing the cursor over the one-word segment “John” in either text highlights “John” in both texts. Placing the cursor over the one-word segment “gave” in either text highlights “gave” in both texts. Placing the cursor over the one-word segment “Mary” in either text highlights “Mary” in both texts. Finally, placing the cursor over the two-word segment “a kiss” highlights “a kiss” in both texts. As such, this provides a way for a user to ask where a particular word or phrase in the first document shows up in the second document. While “coercion,” as discussed above, is used to display subsection moves, this active highlighting is one good way of ascertaining text moves within a subsection. If some text that is pointed to in a subsection of one document is located elsewhere (is moved relative thereto) in the same subsection of another document, the appearance of the active highlighting on the pertinent section of the other document enables the user to clearly deduce that a move has been made, and to know exactly how that move has been made.
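The lookup behind active highlighting can be illustrated with a short sketch that, given the segment under the cursor, reports where that segment occurs in each outputted text. The function name `find_segment` and the use of word offsets as the highlight positions are assumptions of this sketch; exact word matching on whitespace-separated tokens is assumed.

```python
from typing import Dict, List


def find_segment(segment: str, texts: Dict[str, str]) -> Dict[str, List[int]]:
    """For each text, list the word offsets at which the segment occurs."""
    needle = segment.split()
    hits: Dict[str, List[int]] = {}
    for name, text in texts.items():
        words = text.split()
        if not needle:
            hits[name] = []          # nothing selected, nothing highlighted
            continue
        hits[name] = [
            i for i in range(len(words) - len(needle) + 1)
            if words[i:i + len(needle)] == needle
        ]
    return hits


texts = {
    "first": "John gave Mary a kiss",
    "second": "Mary gave John a kiss",
}
john = find_segment("John", texts)      # moved from word 0 to word 2
kiss = find_segment("a kiss", texts)    # same position in both texts
```

A differing offset between the two texts, as for “John” here, is what lets the user see that the segment has moved and where it has moved to. The same function applies unchanged when `texts` holds a single document with repeated occurrences.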

Suppose further, for example, that a user wants to see multiple occurrences of a text such as the term “Force Majeure” in a legal contract. By placing the cursor over this term in one place, the user can cause this term to be highlighted in all other places as well. It is to be understood that this can occur across two or more documents, but can also occur within a single document. So, the same result described above would apply if one document read, in its entirety, “John gave Mary a kiss, and then Mary gave John a kiss.” Cursor placed over any “John” highlights all “Johns,” over any “Mary” highlights all “Marys,” etc. If the cursor is placed over the term “Force Majeure” in what happened to be the opening definitions of a legal contract, all occurrences of that term are highlighted. As such, this active highlighting function that is used for simple comparison in one situation (e.g., that of FIG. 11) is the same tool that is used if one wishes to examine text movement within a given subsection, and is also the basic tool used to examine how certain terms are utilized and defined in a single document or across multiple documents.

The overall highlighting scheme chosen for a given implementation of this invention can take the form of what is shown as an example here (i.e., rectangles around differences, nothing for similarities, and underlines for actively selected similarities), or any other form obvious to someone of ordinary skill. Thus, such highlighting schemes can involve outputting different fonts, different colors, different word cases, different emphasis (e.g., bold, italic, underlines, superscripts, subscripts, flashing text, etc.), and any other sort of output differences, in any and all combinations with one another, all within the context of this disclosure and its associated claims. All that is required is that differences be highlighted differently than similarities, no matter what highlighting scheme is chosen. The use of “addition” markers and “deletion” markers such as discussed above, is preferred, but not required. Also, it is contemplated within the scope of this disclosure and its associated claims that a user can optionally choose from a broad range of highlighting schemes by setting his or her “highlighting preferences” accordingly.

Now we turn to the optional context information 1002 and 1004 in the upper two lines of FIG. 10. Underlining is used here, as an example, not a limitation, to designate that this is context information, as opposed to comparison information. This further outputs to the user which subsection headers these outputted subsections are part of, as well as the context within the overall document structure. Thus, the user can tell from FIG. 10 that the material highlighted for differences and similarities from first document 1, is part of a document titled “CHAIR,” and is within a first level subsection titled “Background of the invention,” and that the material highlighted for differences and similarities from second document 2, is part of a document titled “IMPROVED SEATING DEVICE,” and is within a first level subsection titled “Background.” Again, this is not comparison information, this is information to establish the document structural context within which the outputted comparison is taking place. Thus, the output of FIG. 10 contains two types of information: information juxtaposing the output of the selected document subsections relative to one another to visually emphasize a comparison (similarities and changes) therebetween; and optional context information outputted in conjunction with the associated subsections. Note, therefore, that the headers themselves may be repeated: once to visually emphasize the comparison between headers from one document to the next (102 and 202); and once to provide context information (1002 and 1004).

Let us now suppose that the user wishes to simply generate a “substitution list” that shows the changes that have occurred between the two documents under consideration (or among three or more documents in the general case), for either the entirety, or selected subsections, of these documents. Let us suppose, for example, that the user wants to view a substitution list of the changes found comparing the subsection “CHAIR/Background of the invention” to “IMPROVED SEATING DEVICE/Background,” that is, the subsections illustrated in FIGS. 10 and 11. Such a substitution list is illustrated by FIG. 12. While there are many options for arriving at the substitution list of FIG. 12 that would be apparent to someone of ordinary skill, all of which are contemplated within the scope of this disclosure and its associated claims, we discuss below an exemplary process of how this might be done.

As an illustrative example of how to do this, the user goes back to theoutput of any of FIGS. 4, 5 or 6 (any mapping screen that shows theheaders of the two desired subsections juxtaposed to one another) andselects the comparison selection means 410 juxtaposed to “Background ofthe invention” and “Background,” though in a different manner than wasused earlier to jump to the output of FIG. 10. For example, notlimitation, if the user from FIGS. 4, 5 or 6 used a “left click” of amouse on comparison selection means 410 to get to the output of FIG. 10,the user might use a “right click” to jump instead to the output of FIG.12. Or, a right click may cause a menu to be outputted in the mannerthat is customary in the art for right clicks, and then the user wouldselect an option from that menu to bring up the output of FIG. 12.Again, many similar navigational alternatives will be apparent and areconsidered to fall within the scope of this disclosure and itsassociated claims.

Alternatively, working from the output of FIG. 10, which also has a comparison selection means 410 button illustrated astride the headers that provide context information, the user similarly uses the computerized input device to select the comparison selection means 410 (e.g., circle) juxtaposed to “Background of the invention” and “Background” according to some suitable navigational scheme (left click, right click, menus, etc.), which causes a navigational jump to the output of FIG. 12.

Once the user arrives at the screen of FIG. 12, irrespective of the navigational scheme used to get to this output, the user sees a substitution list 1200. In this list, which illustrates one example of how this substitution information may be provided, every addition, replacement, and deletion as between the selected subsections is summarized, along with an optional numeric tally showing the frequency of each said addition, replacement, and deletion. Note, in FIG. 12, all of the changes occur one time each. However, it is to be expected that for most documents, certain changes will occur more than once, and that this numeric tally will be an important item of information to the user reviewing the documents. (It is to be observed that a replacement is in fact a combination of a deletion from first document 1 and an addition to second document 2 in the same location. Thus, the replacement summarized by, e.g., the phrase “a halfway was replaced by an intermediate” could also be communicated with a phrase such as “a halfway was deleted; an intermediate was added,” that is, by indicating only deletions and additions.)
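By way of example, not limitation, a substitution list with a numeric tally of the kind just described might be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `substitution_list` is hypothetical, the comparison is made word by word using Python's standard `difflib` module, and the phrasing of each entry merely imitates the entries described for FIG. 12. It is not represented to be the method used to produce that figure.

```python
from collections import Counter
from difflib import SequenceMatcher


def substitution_list(first_text, second_text):
    """Tally word-level additions, deletions and replacements.

    Hypothetical helper: returns a Counter mapping a description of each
    change to the number of times that change occurs.
    """
    a, b = first_text.split(), second_text.split()
    tally = Counter()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = " ".join(a[i1:i2]), " ".join(b[j1:j2])
        if op == "replace":
            tally[f"{old} was replaced by {new}"] += 1
        elif op == "delete":
            tally[f"{old} was deleted"] += 1
        elif op == "insert":
            tally[f"{new} was added"] += 1
        # "equal" runs are similarities and are not listed
    return tally


if __name__ == "__main__":
    changes = substitution_list(
        "the chair has legs and the chair has arms",
        "the seat has legs and the seat has arms",
    )
    for description, count in changes.items():
        print(f"{count} time(s): {description}")
```

A replacement could equally be reported as a deletion plus an addition by incrementing two entries in the `replace` branch instead of one.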

To the left of each substitution item of the substitution list is an optional substitution item priority selector means 1202 (the illustrated boxes), to be employed as follows. First, suppose the user is reviewing the two documents to glean from the comparison only the changes that are important to the user. Minor changes are to be ignored, but important changes are to be selected for further emphasis and/or review. So, upon reviewing the substitution list of FIG. 12, the user decides that the only changes of importance are the first two listed changes, and that all others can be ignored. Based on this, the user selects (checks off) each box of priority interest, for example, by a mouse point and click. Thereafter, by virtue of this selection, and until the user instructs otherwise, all other (non-selected) changes will be ignored in the various screen outputs, and only the selected priority changes will be highlighted. This will be referred to as “priority highlighting.” Thus, using FIG. 10 as an example, once the selections shown in FIG. 12 are made, the only difference highlighting appearing on a screen such as that of FIG. 10 would be that related to the selections in FIG. 12. All other difference highlighting would cease to appear, or would appear in a different highlighting to indicate that the highlighted matter designates a difference between the documents that the user believes can be ignored. (As earlier, user preferences can be used to establish exactly how this highlighting is to occur.) Also, a method for changing the selections made in FIG. 12 by engaging in certain selections at the output of FIG. 10 might be implemented. For example, if the user changes his or her mind and decides to add (check off) or remove (uncheck) another substitution in the list of FIG. 12, the user might right click on the text of interest in FIG. 10, and, in response to a menu choice, add or remove that text from the checklist of FIG. 12. Again, it is to be understood that the foregoing navigational scheme is just an example and that someone of ordinary skill can readily find other alternatives within the scope of this disclosure and its associated claims.
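As a non-limiting sketch of the priority highlighting just described, the following assumes that each change is identified by its description in the substitution list and that the checked boxes are held as a set of such descriptions. The function names `highlight_styles` and `toggle` and the style names "difference" and "ignored" are hypothetical and chosen only for illustration.

```python
def highlight_styles(differences, priority=None):
    """Map each change description to a highlighting style name.

    differences: descriptions of the changes found by the comparison.
    priority: set of descriptions the user checked off; empty or None
              means no priority selection is in effect.
    """
    if not priority:
        # No boxes checked: every difference is highlighted normally.
        return {d: "difference" for d in differences}
    # Boxes checked: only selected changes keep the normal highlighting.
    return {d: ("difference" if d in priority else "ignored") for d in differences}


def toggle(priority, description):
    """Check or uncheck one substitution, returning a new selection set."""
    return set(priority) ^ {description}
```

An implementation could drop the "ignored" changes from the output entirely, or render them with lesser emphasis, according to the user's highlighting preferences.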

Suppose now that the user is at the screen of FIG. 12, but wishes to “go to” all occurrences of one of the changes listed in FIG. 12 and see the actual change in the context of the FIG. 10 output. For example, the user wishes to see where “need” was replaced by “become tired and would like.” As an example of how to navigate this, it is observed that the numeric tally is underlined, in this case, so as to designate a “hyperlink.” Thus, pointing and clicking the “1 time:” hyperlink astride the “need was replaced by become tired and would like” entry causes a jump to the screen of FIG. 10. When the user arrives at the screen of FIG. 10, the words “need” in the first document 1 output and “become tired and would like” in the second document 2 output appear in a different highlighting (for example, flashing on and off, or extra bold, etc.) so as to show the user the context within which the change selected from FIG. 12 occurs. Other substitutions may be left un-highlighted, or highlighted with lesser emphasis relative to the selected substitution. Once again, optional user preferences can be used to specify the highlighting scheme. This will all be referred to as “go-to” output highlighting. If there are multiple occurrences (2 or more times) of a particular selected substitution, the resulting screen shows all occurrences of the selected substitution. Additionally, the user is provided options such as “go to next” and “go to previous” that are known in the art for single documents, but which do not appear to be known for simultaneously jumping through 2 or more documents in parallel.
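The “go-to” navigation described above can be illustrated, again only as a sketch, by recording each occurrence of a selected replacement as a pair of positions, one in each document, and stepping through those pairs together. The names `occurrences` and `GoTo` are hypothetical, positions are word indices, and the word-level matching by `difflib` is an assumption of this example rather than a requirement of the disclosure.

```python
from difflib import SequenceMatcher


def occurrences(first_words, second_words, old, new):
    """Return (first_index, second_index) pairs where `old` became `new`.

    `old` and `new` are lists of words, e.g. ["chair"] and ["seat"].
    """
    hits = []
    matcher = SequenceMatcher(None, first_words, second_words, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "replace" and first_words[i1:i2] == old and second_words[j1:j2] == new:
            hits.append((i1, j1))
    return hits


class GoTo:
    """Step through paired positions in two documents in parallel.

    Assumes at least one occurrence was found.
    """

    def __init__(self, hits):
        self.hits = hits
        self.current = 0

    def here(self):
        return self.hits[self.current]

    def go_to_next(self):
        # Stay on the last occurrence rather than wrapping around.
        self.current = min(self.current + 1, len(self.hits) - 1)
        return self.here()

    def go_to_previous(self):
        # Stay on the first occurrence rather than wrapping around.
        self.current = max(self.current - 1, 0)
        return self.here()
```

Each returned pair gives the location to scroll to in the first document output and in the second document output at the same time.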

The subsection comparison illustrated in FIG. 10 was for a single subsection at the lowest (bottom level) node in the document hierarchical structure. That is, from the output of FIG. 4, 5 or 6, the user selected the comparison selection means 410 button juxtaposed to the “Background of the invention” and “Background” headers, and there was no “+” or “−” expansion or contraction button juxtaposed to these headers to indicate that these sections have any further substructure.

Let us suppose now that the user wishes to examine changes at a higher level document structure, i.e., at a level that is not a bottom-level node. For example, suppose the user wishes to examine changes in first document 1 subsection “Detailed description” which is juxtaposed to the second document subsection “Detailed description of the invention.” These sections, as indicated by the “+” and “−” buttons in FIGS. 4, 5 and 6, do indeed have substructure, and thus are not bottom level nodes. (Recall by the way that here, this substructure originally exists only in first document 1, and is coerced into second document 2. But in many situations, the substructure will originally exist in both documents.) In this event, similarly to what was discussed before, the user simply selects the comparison selection means 412 (e.g., the small circle) astride these headers, resulting in the output of FIG. 13. In FIG. 13, the output compares the entire detailed description section from both documents, over all of the substructure in these “Detailed description” and “Detailed description of the invention” subsections.

FIG. 13, like FIG. 10, shows both context and comparison information.(Recall that the context information, for example, not limitation, isunderlined.) This output is similar to that of FIG. 10 and is to beinterpreted in the same way. What is different is that severalsubsections are outputted, since the user, by selecting the “Detaileddescription” and “Detailed description of the invention” subsections,selected to see all of the lower-level subsections within these selectedsubsections. Context information is provided in conjunction with eachlower level subsection, and a means for jumping to a substitution listlike that of FIG. 12 is also provided via the comparison selection means(e.g., small circles).

From here, if the user wants to generate a substitution listencompassing all of the material outputted in FIG. 13, the user would(again, simply as a non-limiting example) point and click on thecomparison selection means 412 juxtaposed to the “Detailed description”and “Detailed description of the invention” headers, to bring up asubstitution list like that of FIG. 12, but for all of the text shown inFIG. 13. If the user wanted to view the substitution list for just onesubsection (e.g., for the subsection associated with the “ii) Legs:” and“First” headers, the user would (for example) point and click on thecomparison selection means 1402 juxtaposed to these headers, and wouldget a narrower substitution list for just the selected sections.Pointing and clicking on the numeric tally hyperlink for a givensubstitution would show all occurrences of the selected substitution,within all of the subsections that were originally selected for outputin FIG. 13. A broad range of navigational options, such as discussedearlier, can all be employed within the scope of this disclosure and itsassociated claims.

If, on the other hand, the user wants to see a comparison of the entirety of both documents, the user would go to the output of any of FIGS. 3, 4, 5, or 6, and select the comparison selection means 304 astride the header pair “CHAIR” and “IMPROVED SEATING DEVICE”, and the result would be an output similar to FIG. 13, but for the entirety of both documents. (It is understood that long documents will not output on a single screen, and that some form of known-or-obvious-in-the-art means for, e.g., “continuous scrolling” or “discrete paging” would be employed, but that this would scroll or page through two or more documents simultaneously, which does not appear to be known. The use of the term “scrolling” herein is to be broadly interpreted, to encompass all known forms of moving serially through a document from front to back or from back to front including discrete paging and continuous scrolling, as well as jumping to a specified location such as “going to” a specified page or searching for a key word or phrase.) Similarly, to output a substitution list analogous to FIG. 12, but for the entirety of both documents, the user would suitably select, for example, any of the selector circles juxtaposed with the “CHAIR” and “IMPROVED SEATING DEVICE” headers on any of the screen outputs of FIGS. 3, 4, 5 or 6. From such a “global” substitution list, the user can select the hyperlink for any given substitution, and receive an output showing all occurrences of the selected substitution throughout the entire pair of documents.

The point is that for a structured document, the user can select to output a comparison or a substitution list of any level of the document, whether that be for a single subsection at the bottom of the document structure, or for the entire document at the very top node of the document structure, or at any level in between, and can navigate around among these various options at whatever structural level has been selected.
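One way to picture comparison at any structural level, offered as a sketch and not as the disclosed implementation, is to hold each document as a tree of subsections and to gather the text of a selected node together with all of its lower-level subsections before comparing. The `Section` class and the `full_text` function below are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """A subsection: a header, its own body text, and nested subsections."""
    header: str
    body: str = ""
    children: list = field(default_factory=list)


def full_text(section):
    """Concatenate a section's body with the bodies of all descendants, in order."""
    parts = [section.body] if section.body else []
    for child in section.children:
        text = full_text(child)
        if text:
            parts.append(text)
    return " ".join(parts)


if __name__ == "__main__":
    detailed = Section("Detailed description", "A chair is described.", [
        Section("i) Seat:", "The seat is flat."),
        Section("ii) Legs:", "There are four legs."),
    ])
    # Bottom-level node: only its own text is selected for comparison.
    print(full_text(detailed.children[1]))
    # Higher-level node: all of its substructure is selected.
    print(full_text(detailed))
```

Selecting the top node of each document in this way would yield the text for a comparison or substitution list covering the entirety of both documents.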

Another feature of interest is the use of comparison summaries and similarity indicators. In many instances, the user may desire statistical summary information that enables the user at a glance to see how similar or how different various document sections are relative to one another.

As noted earlier, the process of “mapping” is based on taking certain similarity measures as between the various document subsections. The word counting/percentage of similar words schema outlined above (in combination with stemming, thesaurus use, length of the word strings in matching segments, etc.) is a simple version of such similarity measures, but it is understood that a wide range of statistical techniques for assessing similarity and differences may be employed within the scope of this disclosure and its associated claims. For example, the references: Uri Zernik (editor), “Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon”, Lawrence Erlbaum Assoc, Publishers, Hillsdale, N.J., 1991; Gerard Salton, “Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer”, Addison Wesley, Reading Mass., 1989; and Christopher D. Manning and Hinrich Schutze, “Foundations of Statistical Natural Language Processing”, The MIT Press, Cambridge, Mass., 1999; all present a range of techniques for similarity measurement that would be suitable for use in connection with the invention disclosed herein. It is to be understood that the use of similarity measures such as those discussed in the references, in combination with the other disclosed and claimed aspects of applicant's invention, is considered to be within the scope of this disclosure and its associated claims.
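For illustration only, a very simple “percentage of similar words” measure might be sketched as below. The formula used here (twice the number of shared words divided by the total number of words in both subsections) is an assumption of this example and is not stated in the disclosure; the sketch also omits stemming, thesaurus lookup, and the length of matching word strings, any of which an implementation might add. The function name `word_similarity` is hypothetical.

```python
from collections import Counter


def word_similarity(first_text, second_text):
    """Return a similarity between 0.0 and 1.0 based on shared word counts."""
    a = Counter(first_text.lower().split())
    b = Counter(second_text.lower().split())
    # Counter intersection keeps, for each word, the smaller of the two counts.
    shared = sum((a & b).values())
    total = sum(a.values()) + sum(b.values())
    # Two empty subsections are treated as identical.
    return 2 * shared / total if total else 1.0


if __name__ == "__main__":
    score = word_similarity("the chair has four legs", "the seat has four legs")
    print(f"{score:.1%}")
```

Because this measure ignores word order, it would rate a reordered subsection as highly similar to the original, which may or may not be desired depending on whether the user's interest is in similarities or in differences.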

Regardless of the method by which these similarity measures are taken (and even if several different types of similarity measure are generated), these similarity measures are of interest to the user, and may be outputted in juxtaposition with the headers of the subsections to which they apply. In all of FIGS. 3 through 11 and 13, comparison selection means (e.g., a series of small circles) are illustrated between (or juxtaposed in some suitable configuration relative to) the various header pairs shown on each output. As discussed at length above, this comparison selection means (which is really just a “button” on which a mouse can be pointed and clicked) has been used to initiate certain outputs, such as the detailed document comparisons of FIGS. 10, 11 and 13, or the substitution list of FIG. 12. However, in a preferred implementation, the comparison selection means further comprises a similarity measure. That is, rather than a simple circle or similar button which provides no information to the user but is just used to point and click, the comparison selection means doubles to provide information about the similarities and differences between their associated subsections. Alternatively, the similarity measure may be indicated distinctly from the comparison selection means.

Thus, for example, if one were using the simple similarity measure discussed above wherein the subsections outputted in FIG. 10 are 82.2% similar, comparison selection means 410 as it appears in FIGS. 4, 5, 6, 10 and 11 would comprise this number, 82.2%, or a similar indicator (e.g., 0.822, a pie chart with 82.2% of the pie filled in, etc.), in the form of a hyperlink. Thus, by pointing and clicking on this indicator (whether by left click, right click with menu, or similar means), one jumps to various types of output as outlined above. But, even without a point and click, comparison selection means provides useful information in the form of outputting one or more similarity measures. In this scheme, the circles in FIGS. 3-11 and 13 would instead be a series of percentages, or pie charts, or fractions, or some other similarity/difference measurement, to tell the user at a glance how similar and/or how different the two or more juxtaposed sections actually are. Thus, a user looking at the output of FIG. 6, for example, would see 12 different percentages, one for each header pair, showing how similar the associated subsections are. Then, the user can use these as a basis for prioritizing which subsections to “drill down” into first, based on the information provided in this similarity measure.
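As a non-limiting sketch of how such indicators might be produced and used for prioritizing, the following formats a similarity measure as a percentage and orders header pairs so that the user can decide where to drill down first. The names `as_indicator` and `drill_down_order`, and the representation of each header pair as a (first header, second header, similarity) tuple, are assumptions made for this example.

```python
def as_indicator(similarity):
    """Format a similarity between 0.0 and 1.0 as a percentage label."""
    return f"{similarity * 100:.1f}%"


def drill_down_order(pairs, most_different_first=True):
    """Order (first_header, second_header, similarity) tuples for review.

    By default the least similar pair comes first, which suits a user whose
    main interest is differences; pass False to put the most similar first.
    """
    return sorted(pairs, key=lambda p: p[2], reverse=not most_different_first)


if __name__ == "__main__":
    pairs = [
        ("Background of the invention", "Background", 0.822),
        ("Summary", "Summary", 0.95),
        ("Detailed description", "Detailed description of the invention", 0.61),
    ]
    for first, second, sim in drill_down_order(pairs):
        print(f"{first} | {as_indicator(sim)} | {second}")
```

The similarity values in the usage example other than 0.822 are invented solely to show the ordering.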

Again, it is understood that a broad range of similarity measures can be taken and outputted in this or a similar manner, all within the scope of this disclosure and its associated claims. Similarly, if there are several similarity measures available from a particular implementation, the user can establish preferences about which of these similarity measures are to be outputted. In some cases, where the user's interest is “differences,” the user might select a similarity measure that is particularly indicative of the difference between the documents. In other situations, if “similarity” is the main interest, then the similarity measure preferred for output would be one that emphasizes similarity.

A good example of the helpfulness of this type of statistical summary information is to consider, for example, a copyright attorney who is looking to ascertain the degree to which a client's document (first document) has been “copied” or “plagiarized” by a possibly-infringing (second) document. The mapping method of the invention, the results of which are outputted in FIGS. 3 through 9, helps to uncover any reordering and restructuring that may have taken place in the infringing document to “hide” the copying. The further navigation to comparison outputs such as shown in FIGS. 10 through 13 provides further useful information to assess copying, in detail. However, all of this, coupled with the output of various summary similarity measures on a section-by-section basis, as well as document-wide and at all levels of substructure, provides a truly comprehensive picture of whether, where, and to what degree, first document 1 has been copied by second document 2.

Until this point, we have examined mapping and comparison involving two or more documents, that is, inter-document mapping and comparison. However, there may also be circumstances in which the user will wish to compare two different subsections within the same document, that is, where the user desires to perform an intra-document comparison. FIG. 14 illustrates an intra-document variation of the invention.

As has been the case all along, there are many navigational options for reaching the screen output of FIG. 14, and we shall discuss one exemplary option here. Starting from a document outline that is expanded to show two different subsections to be compared, both within the same document, such as the right-side column in FIG. 6, the user selects two subsections to be compared with one another, using a range of means known in the art that are suitable for such a purpose. For example, if the user wanted to compare claim 3 with claim 4, the user would first select claim 3, and then select claim 4, for example, by right or left clicking on each, and possibly selecting from a menu that provides an “intra-document comparison” option. That would bring up a screen output such as that of FIG. 14 where these two claims are compared with one another. It is worth noting that the left-side column in FIG. 6 is irrelevant for this purpose and thus all that is really needed is the right-side column. In short, one can navigate to the output of FIG. 14 from a conventional hierarchical output of just the one document for which the intra-document comparison is desired.

The output of FIG. 14 is similar to and carries with it the samefunctionality as that of FIGS. 10, 11 and 13, with optional contextinformation, and with difference highlighting 1002 emphasizing thedifferences between the two subsections selected for comparison.Features like active highlighting, substitution lists, etc., aresimilarly available. In this example, the only difference is thedependency of claim 3 on claim 1, and of claim 4 on claim 2. In thespecialized context of patent claiming, this identifies claims 3 and 4as effectively being a single claim multiply-dependent on claims 1 and2. Note also that claim 4 is indented with respect to claim 2 to denotethat it is part of the hierarchical structure under claim 2, but thatclaim 3 is indented with respect to claim 1, not claim 2, to denote thatit is part of the hierarchical structure under claim 1.

FIGS. 15 and 16 are actual screen shots taken from a prototype reductionto practice of the invention for a complex legal document with a greatdeal of substructure. FIG. 15 shows the “mapped” (table of contents)view of FIGS. 3-9, while FIG. 16 shows the comparison view of FIGS. 10,11 and 13. In FIG. 15, the reader will note the “coercion” in thesubservient document (right hand column) of the headers of the ArticleIII, Section 3.1 subsection to the structure of the dominant document(left hand column), as discussed above. Also appearing in the middlecolumn is a similarity measure. By pointing and clicking on thesimilarity measure between the Article II headers, and by then scrollingto the bottom of the screen, one arrives at the comparison display ofFIG. 16. It is to be observed how the “differences” are highlighteddifferently than the similarities.

Although this disclosure has discussed certain implementations of methods for navigating from one screen to the next, it is understood that many variations and substitutions for navigating from one type of screen output to the next will be apparent to someone of ordinary skill, and that all such variations are to be considered within the scope of this disclosure and its associated claims.

It is understood that although the preferred “juxtaposition” for the various comparisons is horizontal, other juxtapositions which serve to suitably visually emphasize the mapping and comparisons being displayed are equally regarded to be within the scope of this disclosure and its associated claims.

It is important to point out that mapping and comparison functions and outputs disclosed herein can readily be implemented in a stand-alone document comparison system, without any text editing capability. Conversely, it is to be understood within the scope of this disclosure and its associated claims, that these functions and outputs can also be implemented as part of a text editing and/or work sharing system in which a single user, or multiple users in a workgroup, can edit text in conjunction with making use of the mapping and comparison functions disclosed herein.

When used as part of a text editing/work sharing system, the mapping and comparison functions and outputs disclosed herein may be part and parcel of such a system, or may be a “plug-in” module that a user can separately add to augment such a system.

It is also to be understood that in today's computing and telecommunications environment, there are many ways to distribute the computer processing required to implement the mapping and comparison functions and outputs disclosed herein. Of course, the entire invention as herein disclosed may be embodied in software installed directly on an end user's workstation. Or, it can be used in a network environment in a virtually limitless range of configurations. For example, all or most of the software and hardware can reside on a server (or on multiple servers) that is remote from the end user. The end user may then upload documents to the server(s) and instruct the server to carry out the mapping and comparison of the uploaded documents. Then, the server simply would transmit the necessary information to the end user's workstation thereby causing an output (albeit a remote output) of the results of the mapping and comparison with suitable juxtaposition. Similarly, a server would act in response to a computerized input device (albeit a remote input device) in order to perform various tasks associated with the invention. It is to be clearly understood that this disclosure and its associated claims are understood and intended to apply to all such situations, irrespective of whether the processing and output and input are distributed among one or more computerized devices using modern telecommunications and computing systems, or reside on a single computerized device without any communications connection to any other computerized device. Of course, the division of functions as between hardware and software is similarly irrelevant to the application of this disclosure and its associated claims.

The underlying computer system itself, of course, comprises the computer storage, memory, and processor capability necessary to store and access the underlying documents and to perform the necessary matching, comparison and other operations involving these documents as discussed throughout this disclosure, and comprises the computerized input devices (e.g., mouse, keyboard, etc.) and output devices (e.g., display screen) required to accept directions from the user and generate the desired output. One may use a single general purpose computer, or as noted above, a plurality of computers interconnected with one another wherein various storage, processing, input and output tasks are split among more than one computer. Indeed, there is an endless variety of ways in which this device, system and method may be reduced to practice using computerized devices and methods well-known to those of ordinary skill, all of which are regarded to be within the scope of this disclosure and its associated claims.

It is also important to understand the use of the terms “headers” and “subsections” as those terms are used in this disclosure and the associated claims, and as those terms are intended to be understood and interpreted. Headers, as has been discussed at length herein, can be explicit or implicit. The same is true for subsections. An explicit header is one that is already part of the document to begin with such as the headers associated with an explicit table of contents, or such as text that is numbered and/or set forth with certain highlighting characteristics or is somehow “tagged” as a header in the original document text. The subsection associated with such an explicit header is an explicit subsection. An implicit header, on the other hand, is one that is not necessarily in the document to begin with, but is either inferred from the document because the computerized system is programmed to make such an inference (e.g., a claim number is an implicit header in the examples shown), or it is coerced into the document as a consequence of mapping two or more documents together. This includes the “null” headers that are used as placeholders when two document subsections map together, but one of them does not start off with a suitable header, or one document contains a subsection for which the other document simply has no corresponding subsection. An implicit subsection is one that is created in the course of the mapping, in association with an implicit header. For example, in FIG. 5, an implicit heading, namely, a second “Summary” heading, was created for the sentence “This enables a person to sit in an intermediate position between standing, and lying down or sitting on a rug.” This sentence, while not originally a subsection in its own right, was coerced into being an implicit subsection in order to facilitate the mapping and output to the first (dominant) document. Thus, the use of the terms “headers” and “subsections” in the claims should not be interpreted as being restricted only to explicit headers and subsections, but also includes the various types of implicit headers and subsections illustrated herein.
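The role of “null” headers as placeholders can be illustrated with the following sketch, which lays out juxtaposed header rows in the order of the first (dominant) document. Everything named here is an assumption of the example: the function `pair_headers`, the representation of the subsection mapping as a dictionary from first-document positions to second-document positions, the "(null)" placeholder text, and the choice to list unmapped second-document headers after all other rows.

```python
def pair_headers(first_headers, second_headers, mapping, null="(null)"):
    """Build juxtaposed (first_header, second_header) rows.

    mapping: dict from an index into first_headers to the index of the
             second-document header mapped to it; unmapped headers are absent.
    """
    rows = []
    used = set()
    for i, header in enumerate(first_headers):
        j = mapping.get(i)
        if j is None:
            # No counterpart in the second document: use a null placeholder.
            rows.append((header, null))
        else:
            # Coerce the second-document header into the first document's order.
            rows.append((header, second_headers[j]))
            used.add(j)
    for j, header in enumerate(second_headers):
        if j not in used:
            # No counterpart in the first document: null placeholder on the left.
            rows.append((null, header))
    return rows
```

A fuller implementation might place each unmapped second-document header near its original neighbors instead of at the end.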

It is also important to understand that when it is said in this disclosure and its associated claims that a series of document headers and/or a series of document subsections is “outputted,” the material to be outputted may be too large to fit in its entirety onto the window of the computerized output device, and only a portion of this material will actually appear on the output screen at any one time. The user, as is conventional in the art, would then use scrolling or paging or similar functions to move through the material on the screen. Thus, in no way should a statement that some text or header material is “outputted” be interpreted to require that the entirety of that material must be outputted so as to fit into a single output screen. Rather, it is to be understood to mean that some or all of that material is visible on the output device, but that all of the material is available for “output” insofar as it can be scrolled to or paged to at will using methods well known in the art. It is also to be understood that “output” is to be broadly understood to encompass all known means for representing information inside of a computer to a user, such as but not limited to computer display screen, and hardcopy material rendered on a computer printer.

While only certain preferred features of the invention have been illustrated and described, many modifications and changes will occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.

1. A method of outputting information to compare at least two documents,namely, at least a first document and a second document, to facilitatevisual mapping and comparison of said documents, said documentscomprising document subsections and said subsections comprising documentsubsection headers associated therewith, comprising the steps of:mapping the first document subsections relative to the second documentsubsections based on identifying substantial similarities therebetween,to establish a subsection mapping therebetween; in relation to saidsubsection mapping and the association between said document subsectionsand said subsection headers, further mapping the first documentsubsection headers relative to the second document subsection headers toestablish a header mapping; and causing an output of at least one of thefirst document subsection headers to be juxtaposed relative to an outputof second document subsection headers mapping thereto, to visuallyemphasize said header mapping, on a computerized output device, wherein:said identifying substantial similarities therebetween includesidentifying substantial similarities between bodies of said first andsecond document subsections.
 2. The method of claim 1, said step ofmapping further comprising the steps of: combinatorially comparing allof said first document subsections against all of said second documentsubsections, deriving a similarity measure for each said combinatorialcomparison; establishing the subsection mapping of said first documentsubsections to said second document subsections based on a strength ofsaid similarity measures.
 3. The method of claim 1, said step of mapping further comprising the step of: regarding literally different, but synonymous segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
 4. The method of claim 1, said step of mapping further comprising the step of: regarding literally different, but stemmed segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
 5. The method of claim 1, said step ofcausing said output further comprising the step of: juxtaposing thefirst document subsection headers substantially horizontally across fromthe second document subsection headers mapped thereto.
 6. The method ofclaim 1, said step of causing said output further comprising the stepof: using a computerized server to cause said output on saidcomputerized output device remotely connected to said computerizedserver over a telecommunications link.
7. The method of claim 1, said step of causing said output further comprising the step of: using a computerized device to cause said output on said computerized output device locally connected to said computerized device.
8. The method of claim 1, further comprising the step of: causing a similarity measure to be outputted, juxtaposed relative to the juxtaposed first and second document subsection headers for the first and second document subsections for which said similarity measure summarizes a degree of similarity.
9. The method of claim 1, said step of juxtaposing said output further comprising the step of: for at least one second document subsection that is mapped to a first document subsection relative to which it is out of sequence, coercing the header associated with the out-of-sequence second document subsection to output juxtaposed relative to the header of said first document subsection relative to which it is out of sequence.
10. The method of claim 1, said step of juxtaposing said output further comprising the step of: for at least one second document subsection that is mapped to a first document subsection relative to which it is differently-structured within its document, coercing the header associated with the differently-structured second document subsection to output juxtaposed relative to the header of said first document subsection relative to which it is differently-structured.
11. The method of claim 1, further comprising the step of: for at least one unmapped header of a first document subsection determined to have no mapping with any second document subsection, designating a second document null subsection header to map with the first document unmapped header.
12. The method of claim 11, further comprising the step of: for at least one unmapped header of a second document subsection determined to have no mapping with any first document subsection, designating a second document null subsection header to map with the second document unmapped header.
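The juxtaposition, out-of-sequence coercion and null-header designation of claims 9 through 12 can be pictured as building aligned rows of header pairs. The following is a minimal sketch under stated assumptions: the `NULL_HEADER` placeholder text, the row layout and the function name are hypothetical, and the `mapping` argument is taken to be a dictionary from first document subsection index to second document subsection index.

```python
NULL_HEADER = "(no corresponding subsection)"  # hypothetical placeholder


def aligned_header_rows(first_headers, second_headers, mapping):
    """Return (left, right) header rows in first-document order.

    mapping: {first_index: second_index}. A mapped second-document header
    is pulled alongside its first-document counterpart even when it is
    out of sequence; unmapped headers on either side get a null partner.
    """
    rows = []
    for i, header in enumerate(first_headers):
        j = mapping.get(i)
        right = second_headers[j] if j is not None else NULL_HEADER
        rows.append((header, right))
    mapped_second = set(mapping.values())
    for j, header in enumerate(second_headers):
        if j not in mapped_second:
            rows.append((NULL_HEADER, header))
    return rows


rows = aligned_header_rows(
    ["Definitions", "Payment", "Term"],
    ["Payment Terms", "Definitions", "Governing Law"],
    {0: 1, 1: 0},
)
for left, right in rows:
    print(f"{left:<32} | {right}")
```

Appending unmapped second-document headers at the end is a simplification; an implementation could instead interleave them near their neighbors in the second document.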
13. The method of claim 1, said at least two documents comprising more than two documents; said more than two documents comprising said first document, said second document, and at least one other document, further comprising the steps of: mapping the first document subsections relative to the second document subsections and the other document subsections, based on identifying substantial similarities therebetween, to establish a subsection mapping thereamong; in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second and other document subsection headers to establish a header mapping thereamong; and causing an output of at least one of said first document subsection headers to be juxtaposed relative to an output of the second and other document subsection headers mapping therewith, to visually emphasize said header mapping, on a computerized output device.
14. The method of claim 1, further comprising the steps of: causing the output of a selected header pair comprising one of said first document subsection headers and the second document subsection header mapped thereto to be simultaneously expanded into an expanded substructure pair output comprising a first header substructure of the selected first document subsection header and a second header substructure of the selected second document subsection header mapped thereto, in response to expansion selecting said selected header pair using a computerized input device; causing the output of said first header substructure to be juxtaposed relative to the output of said second header substructure, to visually emphasize said mapping between said first header substructure and said second header substructure.
15. The method of claim 14, said step of causing the output to be expanded in response to said computerized input device further comprising the step of: using a computerized server to cause the expansion in response to said computerized input device remotely connected to said computerized server over a telecommunications link.
16. The method of claim 14, said step of causing the output to be expanded in response to said computerized input device further comprising the step of: using a computerized device to cause the expansion in response to said computerized input device locally connected to said computerized device.
17. The method of claim 1, further comprising the steps of: starting from an expanded substructure pair output comprising a first header substructure of at least one first document subsection header and a second header substructure of at least one second document subsection header mapped thereto: causing said output of said first header substructure and said output of said second header substructure to be simultaneously contracted into an output of the header pair comprising said first and second document subsection headers, in response to contraction selecting said expanded substructure pair using a computerized input device.
18. The method of claim 17, said step of causing the output to be contracted in response to said computerized input device further comprising the step of: using a computerized server to cause the contraction in response to said computerized input device remotely connected to said computerized server over a telecommunications link.
19. The method of claim 17, said step of causing the output to be contracted in response to said computerized input device further comprising the step of: using a computerized device to cause the contraction in response to said computerized input device locally connected to said computerized device.
20. The method of claim 1, further comprising the step of: causing the subsections associated with a selected header pair comprising one of said first document subsection headers and the second document subsection header mapped thereto to be simultaneously outputted in juxtaposition relative to one another to visually emphasize a comparison therebetween, in response to comparison selecting said selected header pair using a computerized input device.
21. The method of claim 20, said step of causing the subsections associated with a selected header pair to be simultaneously outputted further comprising the step of: causing the first document subsection to be juxtaposed substantially horizontally across from the second document subsection mapped thereto.
22. The method of claim 20, further comprising the step of: causing a simultaneous scrolling through the outputted subsections of said at least two documents in response to said computerized input device.
23. The method of claim 20, further comprising the steps of: causing first document subsection context information to be outputted comprising at least the first document subsection header of said selected header pair in conjunction with its associated first document subsection; and causing second document subsection context information to be outputted comprising at least the second document subsection header of said selected header pair in conjunction with its associated second document subsection.
24. The method of claim 20, further comprising the steps of: causing differences identified between the first and second document subsections so outputted to be highlighted with difference highlighting; and causing similarities identified between the first and second document subsections so outputted to be highlighted with similarity highlighting different from said difference highlighting.
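The two distinct kinds of highlighting recited in claim 24 can be illustrated with a word-level diff. The sketch below assumes Python's standard `difflib` as the differencing engine and uses hypothetical text markers (`[=...=]` for similarity, `[-...-]` and `[+...+]` for differences) in place of visual highlighting; none of these choices come from the claims.

```python
from difflib import SequenceMatcher


def highlight(first: str, second: str):
    """Return both texts with similarity and difference runs marked."""
    a, b = first.split(), second.split()
    left, right = [], []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            # Similarity highlighting: the same run appears on both sides.
            left.append("[=" + " ".join(a[i1:i2]) + "=]")
            right.append("[=" + " ".join(b[j1:j2]) + "=]")
        else:
            # Difference highlighting: mark whatever each side contributes.
            if i2 > i1:
                left.append("[-" + " ".join(a[i1:i2]) + "-]")
            if j2 > j1:
                right.append("[+" + " ".join(b[j1:j2]) + "+]")
    return " ".join(left), " ".join(right)


left, right = highlight(
    "payment due in thirty days",
    "payment due within thirty days",
)
print(left)
print(right)
```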
25. The method of claim 24, said highlighting said identified similarities further comprising the step of: causing additional similarity highlighting to be activated in response to selecting one of said identified similarities for said additional similarity highlighting using said computerized input device.
26. The method of claim 25, further comprising the step of: causing movement of text as between said first and second document subsections to be highlighted using said additional similarity highlighting.
27. The method of claim 1, further comprising the step of: causing a substitution list to be outputted summarizing any differences identified between at least one of said first document subsections and the second document subsection mapped thereto, said substitution list comprising at least one substitution item.
28. The method of claim 27, further comprising the steps of: selecting at least one of said substitution items as a priority substitution item in response to a computerized input device; and causing the selected priority substitutions to be priority highlighted in at least one other output of the documents on said computerized output device.
29. The method of claim 27, said step of causing said substitution list to be outputted further comprising the steps of: causing deleted matter from said first document subsections that is not contained in said second document subsections to be outputted; and causing added matter from said second document subsections that is not contained in said first document subsections to be outputted.
30. The method of claim 27, said step of causing said substitution list to be outputted further comprising the steps of, for at least one identified difference: causing a numeric tally to be outputted of how many times said identified difference occurs between said first document subsections and the second document subsections mapped thereto; in relation to said numeric tally, if said difference is a replacement, causing specific information regarding said replacement to be outputted comprising subtracted matter from said first document subsections that is not contained in said second document subsections, and added matter from said second document subsections that is not contained in said first document subsections; in relation to said numeric tally, if said difference is a deletion, causing specific information regarding said deletion to be outputted comprising subtracted matter from said first document subsections that is not contained in said second document subsections; and in relation to said numeric tally, if said difference is an addition, causing specific information regarding said addition to be outputted comprising added matter from said second document subsections that is not contained in said first document subsections.
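A substitution list with a numeric tally, as recited in claims 27 through 30, might be accumulated along the following lines. This is a sketch only: the use of `difflib` opcodes to classify each difference as a replacement, deletion or addition, the `(kind, subtracted, added)` key shape and the function name are assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher

# Assumed translation from difflib opcode tags to the claim's terminology.
KINDS = {"replace": "replacement", "delete": "deletion", "insert": "addition"}


def substitution_list(pairs):
    """Tally differences across mapped subsection pairs.

    pairs: iterable of (first_body, second_body) strings.
    Returns a Counter keyed by (kind, subtracted_matter, added_matter).
    """
    tally = Counter()
    for first, second in pairs:
        a, b = first.split(), second.split()
        for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
            if tag == "equal":
                continue
            subtracted = " ".join(a[i1:i2])  # empty for an addition
            added = " ".join(b[j1:j2])       # empty for a deletion
            tally[(KINDS[tag], subtracted, added)] += 1
    return tally


tally = substitution_list([
    ("the vendor shall deliver", "the seller shall deliver"),
    ("the vendor may inspect goods", "the seller may inspect"),
])
for (kind, subtracted, added), count in tally.most_common():
    print(count, kind, repr(subtracted), "->", repr(added))
```

Here the replacement of "vendor" by "seller" is reported once with a tally of two rather than as two separate items, which is the summarizing behavior the tally is meant to provide.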
31. The method of claim 27, further comprising the steps of: for a selected at least one of said differences, causing to be outputted at least one of said first document subsections and the second document subsections mapped thereto between which said differences were so-identified, in response to difference output selecting the selected differences using a computerized input device; and causing to be outputted go-to output highlighting the identified differences between said first and second document subsections so outputted.
32. The method of claim 1, further comprising the step of: causing to be outputted at least one comparison summary summarizing a comparison of a first document subsection with the second document subsection mapped thereto, juxtaposed for visual emphasis relative to the output of the document subsection headers associated with the compared subsections.
33. The method of claim 32, said comparison summary comprising a similarity indicator indicating a degree of similarity between said compared document subsections.
34. The method of claim 1, further comprising the step of: text editing said at least two documents.
35. The method of claim 38, said step of causing the juxtaposed output of the subsections further comprising the step of: juxtaposing the first document subsections substantially horizontally across from their corresponding second document subsections.
36. The method of claim 38, further comprising the step of: causing a simultaneous scrolling through the outputted subsections of said at least two documents in response to said computerized input device.
37. The method of claim 38, further comprising the step of: causing the first and second document subsections to be respectively outputted in conjunction with context information comprising first and second subsection headers associated therewith.
38. A method of outputting information to compare at least two documents, namely at least a first document and a second document, to facilitate visual mapping and comparison of said documents, comprising the steps of: causing an output of at least one subsection of said first document to be juxtaposed relative to an output of a corresponding at least one subsection of said second document to visually emphasize a comparison therebetween, on a computerized output device; causing differences established between the outputted first and second document subsections to be highlighted with difference highlighting; and causing similarities established between the outputted first and second document subsections to be highlighted with similarity highlighting different from said difference highlighting; wherein: said step of highlighting the established similarities further comprises the step of: causing additional similarity highlighting to be simultaneously activated in more than one of said documents in response to selecting one of said established similarities in one of said documents for said additional similarity highlighting using said computerized input device.
39. The method of claim 38, further comprising the step of: causing movement of text as among said more than one of said documents to be highlighted using said additional similarity highlighting.
40. A method of outputting information to compare subsections within a single document, comprising the steps of: causing a plurality of subsections of said document to be outputted, using a computerized output device; causing an output of at least two selected ones of the document subsections to be juxtaposed relative to one another, to visually emphasize correspondences therebetween, in response to comparison selecting the selected document subsections for comparison with one another using a computerized input device; causing differences identified among said selected document subsections to be highlighted with difference highlighting; and causing similarities identified among said selected document subsections to be highlighted with similarity highlighting different from said difference highlighting.
41. The method of claim 40, further comprising the step of: causing additional similarity highlighting to be activated in response to selecting one of the identified similarities for said additional similarity highlighting using said computerized input device.
42. The method of claim 41, further comprising the step of: causing multiple occurrences of text within said document to be highlighted using said additional similarity highlighting.
43. A method of establishing a mapping to compare at least two documents, namely, at least a first document and a second document, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, comprising the step of: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and causing an output based on said mapping to be outputted on a computerized output device; wherein: said identifying substantial similarities therebetween includes identifying substantial similarities between bodies of said first and second document subsections.
44. The method of claim 43, further comprising the step of: in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second document subsection headers to establish a header mapping.
45. The method of claim 43, said step of mapping further comprising the steps of: combinatorially comparing all of said first document subsections against all of said second document subsections; deriving a similarity measure for each said combinatorial comparison; establishing said subsection mapping of said first document subsections to said second document subsections based on a strength of said similarity measures.
46. The method of claim 43, said step of mapping further comprising the step of: regarding literally different, but synonymous segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
47. The method of claim 43, said step of mapping further comprising the step of: regarding literally different, but stemmed segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
48. A method of actively highlighting similar text segments between or among any two or more documents, namely, at least a first document and a second document, comprising the steps of: causing a first selected text segment within said first document to be similarity highlighted on a computerized output device in response to selecting said first selected text segment using a computerized input device; and simultaneously causing at least one other text segment within said second document, similar to said first selected text segment to also be similarity highlighted in response to said selecting said first selected text segment using said computerized input device.
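The active highlighting of claim 48 amounts to looking up, at selection time, which segments of the second document resemble the segment selected in the first. A minimal sketch follows, assuming the second document has already been split into candidate segments and assuming a word-level `difflib` ratio with a hypothetical `threshold` as the test of similarity; the claim itself specifies neither.

```python
from difflib import SequenceMatcher


def similar_segments(selected, second_segments, threshold=0.6):
    """Return indices of second-document segments similar to the selection.

    selected: the text segment chosen in the first document.
    second_segments: list of candidate segment strings from the second.
    """
    words = selected.lower().split()
    hits = []
    for index, segment in enumerate(second_segments):
        score = SequenceMatcher(None, words, segment.lower().split()).ratio()
        if score >= threshold:
            hits.append(index)  # this segment would also be highlighted
    return hits


second_doc_segments = [
    "Governing law is New York",
    "The buyer shall pay the fees",
    "Notices must be in writing",
    "the buyer shall pay all fees",
]
print(similar_segments("The buyer shall pay all fees", second_doc_segments))
```

Because every segment above the threshold is returned, the same lookup also surfaces multiple occurrences of the selected text within the second document.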
49. The method of claim 48, further comprising the step of: causing movement of text to be highlighted using said additional similarity highlighting.
50. The method of claim 48, further comprising the step of: causing multiple occurrences of text to be highlighted using said additional similarity highlighting.
51. The method of claim 48, further comprising the step of: causing text segments of said text that are different from one another to be highlighted with difference highlighting different from said similarity highlighting.
52. A computerized system for outputting information to compare at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, comprising computerized means for: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second document subsection headers to establish said header mapping; and causing an output of at least one of the first document subsection headers to be juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping, on a computerized output device, wherein: said identifying substantial similarities therebetween includes identifying substantial similarities between bodies of said first and second document subsections.
53. The computerized system of claim 52, said computerized means for mapping further comprising computerized means for: combinatorially comparing all of said first document subsections against all of said second document subsections; deriving a similarity measure for each said combinatorial comparison; establishing the subsection mapping of said first document subsections to said second document subsections based on a strength of said similarity measures.
54. The computerized system of claim 52, said computerized means for mapping further comprising computerized means for: regarding literally different, but synonymous segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
55. The computerized system of claim 52, said computerized means for mapping further comprising computerized means for: regarding literally different, but stemmed segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
56. The computerized system of claim 52, said computerized means for causing said output further comprising computerized means for: juxtaposing the first document subsection headers substantially horizontally across from the second document subsection headers mapped thereto.
57. The computerized system of claim 52, further comprising a computerized server remotely connected to said computerized output device over a telecommunications link, said computerized means for causing said output further comprising computerized means for: using said computerized server to cause said output on said computerized output device.
58. The computerized system of claim 52, further comprising a computerized device locally connected to said computerized output device, said computerized means for causing said output further comprising computerized means for: using said computerized device to cause said output on said computerized output device.
59. The computerized system of claim 52, further comprising computerized means for: causing a similarity measure to be outputted, juxtaposed relative to the juxtaposed first and second document subsection headers for the first and second document subsections for which said similarity measure summarizes a degree of similarity.
60. The computerized system of claim 52, said computerized means for juxtaposing said output further comprising computerized means for: for at least one second document subsection that is mapped to a first document subsection relative to which it is out of sequence, coercing the header associated with the out-of-sequence second document subsection to output juxtaposed relative to the header of said first document subsection relative to which it is out of sequence.
61. The computerized system of claim 52, said computerized means for juxtaposing said output further comprising computerized means for: for at least one second document subsection that is mapped to a first document subsection relative to which it is differently-structured within its document, coercing the header associated with the differently-structured second document subsection to output juxtaposed relative to the header of said first document subsection relative to which it is differently-structured.
62. The computerized system of claim 52, further comprising computerized means for: for at least one unmapped header of a first document subsection determined to have no mapping with any second document subsection, designating a second document null subsection header to map with the first document unmapped header.
63. The computerized system of claim 62, further comprising computerized means for: for at least one unmapped header of a second document subsection determined to have no mapping with any first document subsection, designating a second document null subsection header to map with the second document unmapped header.
64. The computerized system of claim 52, said at least two documents comprising more than two documents; said more than two documents comprising said first document, said second document, and at least one other document, further comprising computerized means for: mapping the first document subsections relative to the second document subsections and the other document subsections, based on identifying substantial similarities therebetween, to establish a subsection mapping thereamong; in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second and other document subsection headers to establish a header mapping thereamong; and causing an output of at least one of said first document subsection headers to be juxtaposed relative to an output of the second and other document subsection headers mapping therewith, to visually emphasize said header mapping, on a computerized output device.
65. The computerized system of claim 52, further comprising computerized means for: causing the output of a selected header pair comprising one of said first document subsection headers and the second document subsection header mapped thereto to be simultaneously expanded into an expanded substructure pair output comprising a first header substructure of the selected first document subsection header and a second header substructure of the selected second document subsection header mapped thereto, in response to expansion selecting said selected header pair using a computerized input device; causing the output of said first header substructure to be juxtaposed relative to the output of said second header substructure, to visually emphasize said mapping between said first header substructure and said second header substructure.
66. The computerized system of claim 65, further comprising a computerized server remotely connected to said computerized input device over a telecommunications link, said computerized means for causing the output to be expanded in response to said computerized input device further comprising computerized means for: using said computerized server to cause the expansion in response to said computerized input device.
67. The computerized system of claim 65, further comprising a computerized device locally connected to said computerized input device, said computerized means for causing the output to be expanded in response to said computerized input device further comprising computerized means for: using said computerized device to cause the expansion in response to said computerized input device.
68. The computerized system of claim 52, further comprising computerized means for: starting from an expanded substructure pair output comprising a first header substructure of at least one first document subsection header and a second header substructure of at least one second document subsection header mapped thereto: causing said output of said first header substructure and said output of said second header substructure to be simultaneously contracted into an output of the header pair comprising said first and second document subsection headers, in response to contraction selecting said expanded substructure pair using a computerized input device.
69. The computerized system of claim 68, further comprising a computerized server remotely connected to said computerized input device over a telecommunications link, said computerized means for causing the output to be contracted in response to said computerized input device further comprising computerized means for: using said computerized server to cause the contraction in response to said computerized input device.
70. The computerized system of claim 68, further comprising a computerized device locally connected to said computerized input device, said computerized means for causing the output to be contracted in response to said computerized input device further comprising computerized means for: using said computerized device to cause the contraction in response to said computerized input device.
71. The computerized system of claim 52, further comprising computerized means for: causing the subsections associated with a selected header pair comprising one of said first document subsection headers and the second document subsection header mapped thereto to be simultaneously outputted in juxtaposition relative to one another to visually emphasize a comparison therebetween, in response to comparison selecting said selected header pair using a computerized input device.
72. The computerized system of claim 71, said computerized means for causing the subsections associated with a selected header pair to be simultaneously outputted further comprising computerized means for: causing the first document subsection to be juxtaposed substantially horizontally across from the second document subsection mapped thereto.
73. The computerized system of claim 71, further comprising computerized means for: causing a simultaneous scrolling through the outputted subsections of said at least two documents in response to said computerized input device.
74. The computerized system of claim 71, further comprising computerized means for: causing first document subsection context information to be outputted comprising at least the first document subsection header of said selected header pair in conjunction with its associated first document subsection; and causing second document subsection context information to be outputted comprising at least the second document subsection header of said selected header pair in conjunction with its associated second document subsection.
75. The computerized system of claim 71, further comprising computerized means for: causing differences identified between the first and second document subsections so outputted to be highlighted with difference highlighting; and causing similarities identified between the first and second document subsections so outputted to be highlighted with similarity highlighting different from said difference highlighting.
76. The computerized system of claim 75, said computerized means for causing the identified similarities to be highlighted further comprising computerized means for: causing additional similarity highlighting to be activated in response to selecting one of said identified similarities for said additional similarity highlighting using said computerized input device.
77. The computerized system of claim 76, further comprising computerized means for: causing movement of text as between said first and second document subsections to be highlighted using said additional similarity highlighting.
78. The computerized system of claim 52, further comprising computerized means for: causing a substitution list to be outputted summarizing any differences identified between at least one of said first document subsections and the second document subsection mapped thereto, said substitution list comprising at least one substitution item.
79. The computerized system of claim 78, further comprising computerized means for: selecting at least one of said substitution items as a priority substitution item in response to a computerized input device; and causing the selected priority substitutions to be priority highlighted in at least one other output of the documents on said computerized output device.
80. The computerized system of claim 78, said computerized means for causing said substitution list to be outputted further comprising computerized means for: causing deleted matter from said first document subsections that is not contained in said second document subsections to be outputted; and causing added matter from said second document subsections that is not contained in said first document subsections to be outputted.
81. The computerized system of claim 78, said computerized means for causing said substitution list to be outputted further comprising computerized means for, for at least one identified difference: causing a numeric tally to be outputted of how many times said identified difference occurs between said first document subsections and the second document subsections mapped thereto; in relation to said numeric tally, if said difference is a replacement, causing specific information regarding said replacement to be outputted comprising subtracted matter from said first document subsections that is not contained in said second document subsections, and added matter from said second document subsections that is not contained in said first document subsections; in relation to said numeric tally, if said difference is a deletion, causing specific information regarding said deletion to be outputted comprising subtracted matter from said first document subsections that is not contained in said second document subsections; and in relation to said numeric tally, if said difference is an addition, causing specific information regarding said addition to be outputted comprising added matter from said second document subsections that is not contained in said first document subsections.
82. The computerized system of claim 78, further comprising computerized means for: for a selected at least one of said differences, causing to be outputted at least one of said first document subsections and the second document subsections mapped thereto between which said differences were so-identified, in response to difference output selecting the selected differences using a computerized input device; and causing to be outputted go-to output highlighting the identified differences between said first and second document subsections so outputted.
83. The computerized system of claim 52, further comprising computerized means for: causing to be outputted at least one comparison summary summarizing a comparison of a first document subsection with the second document subsection mapped thereto, juxtaposed for visual emphasis relative to the output of the document subsection headers associated with the compared subsections.
84. The computerized system of claim 83, said comparison summary comprising a similarity indicator indicating a degree of similarity between said compared document subsections.
85. The computerized system of claim 52, further comprising computerized means for: text editing said at least two documents.
86. The computerized system of claim 89, said computerized means for causing the juxtaposed output of the subsections further comprising computerized means for: juxtaposing the first document subsections substantially horizontally across from their corresponding second document subsections.
87. The computerized system of claim 89, further comprising computerized means for: causing a simultaneous scrolling through the outputted subsections of said at least two documents in response to said computerized input device.
88. The computerized system of claim 89, further comprising computerized means for: causing the first and second document subsections to be respectively outputted in conjunction with context information comprising first and second subsection headers associated therewith.
89. A computerized system for outputting information to compare at least two documents, namely at least a first document and a second document, to facilitate visual mapping and comparison of said documents, comprising computerized means for: causing an output of at least one subsection of said first document to be juxtaposed relative to an output of a corresponding at least one subsection of said second document to visually emphasize a comparison therebetween, on a computerized output device; causing differences established between the outputted first and second document subsections to be highlighted with difference highlighting; causing similarities established between the outputted first and second document subsections to be highlighted with similarity highlighting different from said difference highlighting, wherein: said computerized means for highlighting the established similarities further comprises computerized means for: causing additional similarity highlighting to be simultaneously activated in more than one of said documents in response to selecting one of said established similarities in one of said documents for said additional similarity highlighting using said computerized input device.
90. The computerized system of claim 89, further comprising computerized means for: causing movement of text as among said more than one of said documents to be highlighted using said additional similarity highlighting.
91. A computerized system for outputting information to compare subsections within a single document, comprising computerized means for: causing a plurality of subsections of said document to be outputted, using a computerized output device; causing an output of at least two selected ones of the document subsections to be juxtaposed relative to one another, to visually emphasize correspondences therebetween, in response to comparison selecting the selected document subsections for comparison with one another using a computerized input device; causing differences identified among said selected document subsections to be highlighted with difference highlighting; and causing similarities identified among said selected document subsections to be highlighted with similarity highlighting different from said difference highlighting.
92. The computerized system of claim 91, further comprising computerized means for: causing additional similarity highlighting to be activated in response to selecting one of the identified similarities for said additional similarity highlighting using said computerized input device.
93. The computerized system of claim 92, further comprising computerized means for: causing multiple occurrences of text within said document to be highlighted using said additional similarity highlighting.
94. A computerized system for establishing a mapping to compare at least two documents, namely, at least a first document and a second document, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, comprising computerized means for: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and causing an output based on said mapping to be outputted on a computerized output device; wherein: said identifying substantial similarities therebetween includes identifying substantial similarities between bodies of said first and second document subsections.
95. The computerized system of claim 94, further comprising computerized means for: in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second document subsection headers to establish a header mapping.
96. The computerized system of claim 94, said computerized means for mapping further comprising computerized means for: combinatorially comparing all of said first document subsections against all of said second document subsections; deriving a similarity measure for each said combinatorial comparison; establishing said subsection mapping of said first document subsections to said second document subsections based on a strength of said similarity measures.
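By way of non-limiting illustration only, the combinatorial comparison of claim 96 and the header mapping of claim 95 might be sketched as follows. The sketch assumes a word-level `difflib.SequenceMatcher` ratio as the similarity measure and a greedy, one-to-one, strongest-first assignment; both choices, the function names, and the `threshold` parameter are hypothetical and are not taken from this disclosure.

```python
from difflib import SequenceMatcher
from itertools import product

def similarity(first_body: str, second_body: str) -> float:
    """Assumed similarity measure: word-level match ratio in [0, 1]."""
    return SequenceMatcher(None, first_body.split(), second_body.split()).ratio()

def map_headers(first_doc: dict, second_doc: dict, threshold: float = 0.5) -> dict:
    """Map first-document headers to second-document headers via their bodies.

    first_doc, second_doc: dicts of subsection header -> subsection body.
    Every first subsection is compared against every second subsection, and
    pairs are then accepted strongest-first, each subsection used at most once.
    """
    scored = sorted(
        (
            (similarity(fb, sb), fh, sh)
            for (fh, fb), (sh, sb) in product(first_doc.items(), second_doc.items())
        ),
        key=lambda item: item[0],
        reverse=True,
    )
    mapping, used = {}, set()
    for score, fh, sh in scored:
        if score < threshold:
            break  # remaining pairs are too weak to count as substantially similar
        if fh in mapping or sh in used:
            continue
        mapping[fh] = sh
        used.add(sh)
    return mapping

first = {
    "Payment": "the buyer shall pay the seller within thirty days",
    "Term": "this agreement lasts two years",
}
second = {
    "Fees": "the buyer shall pay the seller within sixty days",
    "Duration": "this agreement lasts three years",
    "Payment": "notices must be sent in writing",
}
header_mapping = map_headers(first, second)
```

Because the mapping is driven by the subsection bodies, the first document's "Payment" header maps to "Fees" in the second document even though the second document also contains a header reading "Payment".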
97. The computerized system of claim 94, said computerized means for mapping further comprising computerized means for: regarding literally different, but synonymous segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
98. The computerized system of claim 94, said computerized means for mapping further comprising computerized means for: regarding literally different, but stemmed segments of said document subsections to be similar to one another for the purpose of identifying said substantial similarities.
99. A computerized system for actively highlighting similar text segments between or among any two or more documents, namely, at least a first document and a second document, comprising computerized means for: causing a first selected text segment within said first document to be similarity highlighted on a computerized output device in response to selecting said first selected text segment using a computerized input device; and simultaneously causing at least one other text segment within said second document, similar to said first selected text segment to also be similarity highlighted in response to said selecting said first selected text segment using said computerized input device.
100. The computerized system of claim 99, further comprising computerized means for: causing movement of text to be highlighted using said additional similarity highlighting.
101. The computerized system of claim 99, further comprising computerized means for: causing multiple occurrences of text to be highlighted using said additional similarity highlighting.
102. The computerized system of claim 99, further comprising computerized means for: causing text segments of said text that are different from one another to be highlighted with difference highlighting different from said similarity highlighting.
103. A computer-readable medium comprising a set of instructions executable by a computer for comparing at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, said computer-readable medium comprising one or more instructions for: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; in relation to said subsection mapping and the association between said document subsections and said subsection headers, further mapping the first document subsection headers relative to the second document subsection headers to establish said header mapping; and causing an output of at least one of the first document subsection headers to be juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping, on a computerized output device, wherein: said identifying substantial similarities therebetween includes identifying substantial similarities between bodies of said first and second document subsections.
104. The computer-readable medium of claim 103, said computer-readable medium comprising at least one computer disk comprising said set of instructions.
105. The computer-readable medium of claim 103, said computer-readable medium comprising a first computer disk comprising said set of instructions and a telecommunications link enabling said set of instructions to be transmitted over said telecommunications link from said first computer disk to a second computer disk.
106. A computer-readable medium comprising a set of instructions executable by a computer for comparing at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, said computer-readable medium comprising one or more instructions for: causing an output of at least one subsection of said first document to be juxtaposed relative to an output of a corresponding at least one subsection of said second document to visually emphasize a comparison therebetween, on a computerized output device; causing differences established between the outputted first and second document subsections to be highlighted with difference highlighting; causing similarities established between the outputted first and second document subsections to be highlighted with similarity highlighting different from said difference highlighting; wherein: said computerized means for highlighting the established similarities further comprises computerized means for: causing additional similarity highlighting to be simultaneously activated in more than one of said documents in response to selecting one of said established similarities in one of said documents for said additional similarity highlighting using said computerized input device.
107. The computer-readable medium of claim 106, said computer-readable medium comprising at least one computer disk comprising said set of instructions.
108. The computer-readable medium of claim 106, said computer-readable medium comprising a first computer disk comprising said set of instructions and a telecommunications link enabling said set of instructions to be transmitted over said telecommunications link from said first computer disk to a second computer disk.
109. A computer-readable medium comprising a set of instructions executable by a computer for outputting information to compare subsections within a single document, said computer-readable medium comprising one or more instructions for: causing a plurality of subsections of said document to be outputted, using a computerized output device; causing an output of at least two selected ones of the document subsections to be juxtaposed relative to one another, to visually emphasize correspondences therebetween, in response to comparison selecting the selected document subsections for comparison with one another using a computerized input device; causing differences identified among said selected document subsections to be highlighted with difference highlighting; and causing similarities identified among said selected document subsections to be highlighted with similarity highlighting different from said difference highlighting.
110. The computer-readable medium of claim 109, said computer-readable medium comprising at least one computer disk comprising said set of instructions.
111. The computer-readable medium of claim 109, said computer-readable medium comprising a first computer disk comprising said set of instructions and a telecommunications link enabling said set of instructions to be transmitted over said telecommunications link from said first computer disk to a second computer disk.
112. A computer-readable medium comprising a set of instructions executable by a computer for establishing a mapping to compare at least two documents, namely, at least a first document and a second document, said documents comprising document subsections and said subsections comprising document subsection headers associated therewith, said computer-readable medium comprising one or more instructions for: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and causing an output based on said mapping to be outputted on a computerized output device; wherein: said identifying substantial similarities therebetween includes identifying substantial similarities between bodies of said first and second document subsections.
113. The computer-readable medium of claim 112, said computer-readable medium comprising at least one computer disk comprising said set of instructions.
114. The computer-readable medium of claim 112, said computer-readable medium comprising a first computer disk comprising said set of instructions and a telecommunications link enabling said set of instructions to be transmitted over said telecommunications link from said first computer disk to a second computer disk.
115. A method of outputting information to compare at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said first document comprising a first document subsection associated with a first document header, and said second document comprising a second document subsection associated with a second document header and a third document subsection associated with a third document header, wherein a subsection mapping between said first document subsection and said second document subsection is closer than a subsection mapping between said first document subsection and said third document subsection, and wherein said first document header is more similar to said third document header than it is to said second document header, comprising the step of: causing an output of said first document header to be juxtaposed relative to an output of said second document header despite the greater similarity between said first document header and said third document header, on a computerized output device.
116. The method of claim 115, wherein said first document header exactly matches said third document header not juxtaposed relative thereto.
117. The method of claim 115, wherein said first document header and said second document header juxtaposed relative thereto differ from one another by at least one character.
118. A computerized system for outputting information to compare at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said first document comprising a first document subsection associated with a first document header, and said second document comprising a second document subsection associated with a second document header and a third document subsection associated with a third document header, wherein a subsection mapping between said first document subsection and said second document subsection is closer than a subsection mapping between said first document subsection and said third document subsection, and wherein said first document header is more similar to said third document header than it is to said second document header, comprising computerized means for: causing an output of said first document header to be juxtaposed relative to an output of said second document header despite the greater similarity between said first document header and said third document header, on a computerized output device.
119. The method of claim 118, wherein said first document header exactly matches said third document header not juxtaposed relative thereto.
120. The method of claim 118, wherein said first document header and said second document header juxtaposed relative thereto differ from one another by at least one character.
121. A computer-readable medium comprising a set of instructions executable by a computer for comparing at least two documents, namely, at least a first document and a second document, to facilitate visual mapping and comparison of said documents, said first document comprising a first document subsection associated with a first document header, and said second document comprising a second document subsection associated with a second document header and a third document subsection associated with a third document header, wherein a subsection mapping between said first document subsection and said second document subsection is closer than a subsection mapping between said first document subsection and said third document subsection, and wherein said first document header is more similar to said third document header than it is to said second document header, said computer-readable medium comprising one or more instructions for: causing an output of said first document header to be juxtaposed relative to an output of said second document header despite the greater similarity between said first document header and said third document header, on a computerized output device.
122. The computer-readable medium of claim 121, wherein said first document header exactly matches said third document header not juxtaposed relative thereto.
123. The computer-readable medium of claim 121, wherein said first document header and said second document header juxtaposed relative thereto differ from one another by at least one character.
124. The computer-readable medium of claim 121, said computer-readable medium comprising at least one computer disk comprising said set of instructions.
125. The computer-readable medium of claim 121, said computer-readable medium comprising a first computer disk comprising said set of instructions and a telecommunications link enabling said set of instructions to be transmitted over said telecommunications link from said first computer disk to a second computer disk.
126. The method of claim 48, wherein said first and second documents are not equivalent to one another.
127. The method of claim 48, wherein said second document is not an equivalent file generated from said first document and said first document is not an equivalent file generated from said second document.
128. The method of claim 48, said computerized input device comprising a mouse, wherein said step of selecting said first selected text segment comprises positioning a cursor over said first selected text segment without clicking said mouse.
129. The computerized system of claim 99, wherein said first and second documents are not equivalent to one another.
130. The computerized system of claim 99, wherein said second document is not an equivalent file generated from said first document and said first document is not an equivalent file generated from said second document.
131. The computerized system of claim 99: said computerized input device comprising a mouse, wherein: said first selected text segment is selected by a cursor positioned over said first selected text segment without a click of said mouse.